Following the upgrade of Sourcebot to v4.16.14, I'm now seeing the following errors in the Kubernetes pod logs:
Suggest a cause
Answer
podSecurityContext and the /data Volume
The EACCES: permission denied error on /data/.sourcebot/repos/131 is caused by a file-system ownership conflict between the Kubernetes pod's security context and the persistent volume that was already initialised under a different UID/GID.
1. The Dockerfile creates a sourcebot user with UID/GID 1500
In the runner stage of the Dockerfile (Dockerfile:196-208), the image creates the sourcebot user with UID 1500 and GID 1500, and then runs chown -R sourcebot /data (Dockerfile:264-265). Critically, there is no USER directive: the comment explicitly says the container can run as root or as the sourcebot user depending on how it is launched (Dockerfile:273-275).
2. The Helm chart's default podSecurityContext enforces UID/GID 1500 with runAsNonRoot: true
The Helm chart's default values enforce running as the non-root sourcebot user (values.yaml:117-122): runAsUser: 1500 and fsGroup: 1500, with runAsNonRoot: true.
3. The repo data path is resolved directly from DATA_CACHE_DIR
getRepoPath() constructs the path /data/.sourcebot/repos/<id> directly from env.DATA_CACHE_DIR (utils.ts:104-120). The web process then calls stat on this path (via getFileSourceForRepo) to read repo content for the new review agent feature (getFileSourceApi.ts:18-30).
4. The new review agent feature in v4.16.14 introduced getFileSourceForRepo
The v4.16.14 release introduced direct file-system access via the review agent's fetchFileContent.ts (fetchFileContent.ts:1-41). It now calls getFileSourceForRepo (bypassing the session auth layer), which in turn calls simpleGit().cwd(repoPath) and stats the repo path directly on disk.
The most common scenario in Kubernetes is:
- The persistent volume (/data) was originally populated when the container ran as root (UID 0), or under a different UID/GID than 1500. The files under /data/.sourcebot/repos/ are therefore owned by that old UID.
- After the upgrade (Helm chart 0.1.79), the pod now runs with runAsUser: 1500 enforced by the podSecurityContext (values.yaml:118-122), and fsGroup: 1500 is only applied to newly created files, not to pre-existing ones already on the volume.
- The web process calls stat on /data/.sourcebot/repos/131, which is owned by a different UID, hence EACCES.
The Dockerfile note that the USER directive is intentionally omitted (Dockerfile:273-275) means earlier deployments (docker-compose, or Helm charts before podSecurityContext was defaulted to non-root) likely ran as root, creating the repo data as root-owned files that UID 1500 now cannot read.
Option A — Chown the volume data (recommended one-time fix):
Run a one-off init container or job to chown -R 1500:1500 /data on the persistent volume before the pod starts. This aligns the existing data ownership with the new UID.
You can add this via sourcebot.initContainers in the Helm values (deployment.yaml:37-40).
Option B — Override podSecurityContext to run as root (temporary workaround):
Set podSecurityContext: {} or runAsUser: 0 in your Helm values to restore root execution, matching how the data was originally written. This trades security for compatibility.
Option C — Override the podSecurityContext to match the UID that actually owns the data:
If you know the original UID (e.g., 0 for root), set runAsUser to match and remove runAsNonRoot: true.
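Before changing ownership or the security context, it may help to confirm which paths are actually owned by a different UID. The following is a minimal diagnostic sketch in TypeScript, assuming it is run with Node inside the pod; findForeignOwned is a hypothetical helper name, not part of Sourcebot.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical diagnostic helper (not part of Sourcebot).
// Walks `root` and returns up to `limit` paths whose owner is not `expectedUid`.
function findForeignOwned(root: string, expectedUid: number, limit = 20): string[] {
  const found: string[] = [];
  const stack: string[] = [root];
  while (stack.length > 0 && found.length < limit) {
    const current = stack.pop() as string;
    let st: fs.Stats;
    try {
      st = fs.lstatSync(current); // lstat: do not follow symlinks
    } catch {
      found.push(current); // cannot even stat it, so report it
      continue;
    }
    if (st.uid !== expectedUid) found.push(current);
    if (st.isDirectory()) {
      try {
        for (const name of fs.readdirSync(current)) {
          stack.push(path.join(current, name));
        }
      } catch {
        // Unreadable directory: skip its children.
      }
    }
  }
  return found;
}
```

Calling findForeignOwned("/data/.sourcebot/repos", 1500) and getting an empty array back would suggest ownership is not the problem; a non-empty result lists the first paths worth inspecting.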
Previous version is 4.16.2. When did the podSecurityContext change?
Answer
The podSecurityContext default was changed from an empty object ({}) to enforcing UID/GID 1500 with runAsNonRoot: true in Helm chart version 0.1.4, introduced by commit a59b763 on 5 November 2025, titled "add default pod security context".
That said, this is not what changed between your two app versions. Both v4.16.2 and v4.16.14 shipped with the same podSecurityContext defaulting to UID/GID 1500 (values.yaml:96-100); that change was made long before either of those releases. The specific timeline relevant to you is:
| Helm Chart Version | App Version | Date |
|---|---|---|
| 0.1.4 | v4.8.1 | 5 Nov 2025 |
| 0.1.65 | v4.16.2 | 24 Mar 2026 |
| 0.1.79 | v4.16.14 | 21 Apr 2026 |
The change that introduced this in values.yaml (values.yaml:95-100) replaced the empty podSecurityContext ({}) with the UID/GID 1500, runAsNonRoot: true defaults.
This means if you were running the Helm chart prior to 0.1.4 (i.e., before 5 Nov 2025) and your PV was populated at that time, the data on disk would have been written as root (UID 0), and every Helm chart version since 0.1.4, including both 0.1.65/v4.16.2 and 0.1.79/v4.16.14, has run as UID 1500. The permission error surfacing now between v4.16.2 → v4.16.14 is therefore more likely explained by the new review-agent code path that started directly stat-ing files under /data/.sourcebot/repos/ (fetchFileContent.ts:1-41), hitting ownership mismatches that earlier code paths never exercised.
OK, so that's not my issue then, as the previous version was 4.16.2... What could have caused the observed error between those 2 versions?
Answer
The EACCES: permission denied, stat '/data/.sourcebot/repos/131' error between v4.16.2 and v4.16.14 is caused by two distinct code changes introduced in that window:
getFileSourceForRepo now does a direct stat of the repo directory (introduced in v4.16.8 / PR #1076)
In v4.16.2, the review agent's fetchFileContent node called getFileSource, which went through the full withOptionalAuth HTTP-style auth layer. In v4.16.8, PR #1076 refactored the web routes to remove the org domain prefix from URLs and, as part of this, introduced getFileSourceForRepo (getFileSourceApi.ts:26-46), a new "privileged" bypass that goes directly to the filesystem without going through auth middleware.
fetchFileContent was updated to call this new function (fetchFileContent.ts:38-38).
Inside getFileSourceForRepo, it calls getRepoPath(repo) (utils.ts:104-120), which resolves to /data/.sourcebot/repos/<id>, then calls simpleGit().cwd(repoPath) (getFileSourceApi.ts:46-46). The cwd() call causes simple-git to stat that directory to verify it exists, producing the exact EACCES: permission denied, stat '/data/.sourcebot/repos/131' error you see.
Previously (in v4.16.2), this code path didn't exist — file content was fetched through the API auth layer which handled these errors gracefully, and simple-git was only invoked after permission checks completed in a different context.
simple-git was bumped from ^3.27.0 to ^3.33.0
The simple-git dependency was upgraded (package.json:53-55 and package.json:180-182). The newer version may be more eager about validating the cwd path immediately on .cwd() rather than lazily on the first git command, which means a permission error that was previously deferred and silently handled now surfaces immediately as a thrown exception.
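To make the eager-versus-lazy distinction concrete, here is a small TypeScript sketch. It is an illustration only and is not simple-git's implementation: LazyRunner, EagerRunner, DirCheck and denyAll are hypothetical names, and DirCheck stands in for whatever directory check a library might perform.

```typescript
// Illustration only: NOT simple-git's implementation.
// DirCheck stands in for a stat call and throws if the directory is inaccessible.
type DirCheck = (dir: string) => void;

class LazyRunner {
  private dir = ".";
  private readonly check: DirCheck;
  constructor(check: DirCheck) {
    this.check = check;
  }
  cwd(dir: string): this {
    this.dir = dir; // only records the directory
    return this;
  }
  run(command: string): string {
    this.check(this.dir); // validation is deferred until a command runs
    return `${command} in ${this.dir}`;
  }
}

class EagerRunner {
  private dir = ".";
  private readonly check: DirCheck;
  constructor(check: DirCheck) {
    this.check = check;
  }
  cwd(dir: string): this {
    this.check(dir); // validation happens immediately
    this.dir = dir;
    return this;
  }
  run(command: string): string {
    return `${command} in ${this.dir}`;
  }
}

// A check that always fails the way an inaccessible directory would.
const denyAll: DirCheck = (dir) => {
  throw Object.assign(new Error(`EACCES: permission denied, stat '${dir}'`), { code: "EACCES" });
};
```

With denyAll, new LazyRunner(denyAll).cwd("/x") returns normally and only run() throws, whereas new EagerRunner(denyAll).cwd("/x") throws straight away. If a library moved from the first style to the second, callers that only guarded the command call would start seeing uncaught errors from cwd().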
When a PR review is triggered (either by the REVIEW_AGENT_AUTO_REVIEW_ENABLED flag or the /review command), the review agent node in fetchFileContent.ts (fetchFileContent.ts:25-54) tries to read file contents by calling getFileSourceForRepo, which attempts to stat /data/.sourcebot/repos/131, a directory that the web process user (UID 1500) cannot access.
The permission problem on the actual directory content existed before, but the new direct filesystem code path introduced in v4.16.8 is what exposes it. Previously the review agent used OPENAI_API_KEY directly and the fetch went through a different, safer code path (invokeDiffReviewLlm.ts:1-10). The old version hardcoded OpenAI; the new version uses the configured model via getAISDKLanguageModelAndOptions, and the file fetch now bypasses auth.
Run chown -R 1500:1500 /data/.sourcebot/repos on the Persistent Volume (via an init container or a one-off job) so the UID 1500 web process can stat and read the repo directories that the review agent now accesses directly.
Directory permissions are correct:
Answer
Good catch — the permissions are fine, so this is a different root cause. Looking at the diff between v4.16.2 and v4.16.14, there are two distinct code changes that together explain this error.
simple-git bumped from 3.27.0 → 3.33.0 (shipped in v4.16.5)
PR #1078 bumped simple-git (package.json:53-55 and package.json:180-182).
In simple-git 3.33.0, the library validates the cwd path eagerly at the time .cwd() is called, rather than lazily on the first git command. The {"commands":[],"format":"empty"} in your error is simple-git's internal representation of a failed pre-command task — it throws before any git command is even issued, during the directory stat check. In 3.27.0 this check was deferred (or absent), so it never surfaced.
getFileSourceForRepo added as a new direct filesystem entry point (shipped in v4.16.8)
PR #1076 (Remove org domain from routes) extracted getFileSourceForRepo as a new exported function (getFileSourceApi.ts:26-46).
In v4.16.2, the equivalent code only existed inside getFileSource wrapped in sew(() => withOptionalAuthV2(...)) (getFileSourceApi.ts:20-96). The new getFileSourceForRepo function is called directly without the sew() wrapper, so any exception thrown by simpleGit().cwd() is not caught by the service error wrapper and propagates as an unhandled fatal exception.
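The difference between a wrapped and an unwrapped call can be sketched as follows. This is a simplified, synchronous stand-in written for illustration: withErrorBoundary, readRepoFile and the ServiceError shape are assumptions, and the real sew() wrapper may look quite different.

```typescript
// Hypothetical shape of a structured error; the real ServiceError type may differ.
interface ServiceError {
  kind: "service_error";
  statusCode: number;
  message: string;
}

// Simplified, synchronous stand-in for a sew()-style wrapper:
// run `fn` and convert any thrown exception into a ServiceError value.
function withErrorBoundary<T>(fn: () => T): T | ServiceError {
  try {
    return fn();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { kind: "service_error", statusCode: 500, message };
  }
}

// Stand-in for the extracted function: fails for one repo the way a denied stat would.
function readRepoFile(repoPath: string): string {
  if (repoPath.endsWith("/131")) {
    throw new Error(`EACCES: permission denied, stat '${repoPath}'`);
  }
  return "file contents";
}

// Wrapped call: the caller receives a value it can inspect.
const wrapped = withErrorBoundary(() => readRepoFile("/data/.sourcebot/repos/131"));
```

Calling readRepoFile("/data/.sourcebot/repos/131") directly, without the wrapper, throws to whoever called it. That is the situation described above, where the exception reaches the server runtime instead of being returned as a structured error.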
The stat itself is succeeding for all the repos you can see (100–105). The error on repo 131 means one of two things:
Repo 131 is a newly added repo that exists in the DB but has not yet been cloned to disk — the indexing job is queued but hasn't run. The new getFileSourceForRepo path (e.g., triggered by an API call or the MCP read_file tool) attempts to stat a directory that doesn't exist yet. With simple-git 3.27, this would have been ENOENT; with 3.33 the error presentation changed. However, the error says EACCES, not ENOENT.
More likely: The indexer process (running as a different user or in a separate container) created the /data/.sourcebot/repos/131 directory but with mode 700 or owned by root before the clone completed, and the web process (UID 1500) cannot stat it. This can happen when the backend creates the directory with fs.mkdirSync before calling git clone — the directory exists but is not yet world-readable. The EACCES is transient, occurring in the brief window between directory creation and clone completion.
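To tell the two cases above apart, a quick probe can report whether a stat fails with ENOENT or EACCES, or who owns the directory when it succeeds. This is a minimal sketch assuming Node is available in the web container; probeDir is a hypothetical helper, not a Sourcebot function.

```typescript
import * as fs from "node:fs";

type ProbeResult =
  | { ok: true; uid: number; gid: number; mode: string }
  | { ok: false; code: string };

// Hypothetical helper: stat a directory and report either its ownership or the failure code.
function probeDir(dir: string): ProbeResult {
  try {
    const st = fs.statSync(dir);
    // Keep only the permission bits and render them in octal.
    return { ok: true, uid: st.uid, gid: st.gid, mode: (st.mode & 0o777).toString(8) };
  } catch (err) {
    const code = (err as NodeJS.ErrnoException).code ?? "UNKNOWN";
    return { ok: false, code };
  }
}
```

Running probeDir("/data/.sourcebot/repos/131") as the web process user a few times in a row would show whether the failure is persistent or only appears while the directory is being created.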
1. fetchFileContent in the review agent (fetchFileContent.ts:38-38, added in v4.16.8) calls getFileSourceForRepo directly.
2. getFileSourceForRepo calls simpleGit().cwd(repoPath) (getFileSourceApi.ts:46-46).
3. simple-git 3.33 eagerly validates the path and throws EACCES when it can't stat /data/.sourcebot/repos/131.
4. Without a sew() wrapper around getFileSourceForRepo, the exception propagates as a fatal unhandled error through z.onFatalException / z.attemptTask (the Next.js server task runner), producing the exact stack trace you see.
- Wrap getFileSourceForRepo in a try/catch that maps EACCES/ENOENT to a notFound() or unexpectedError() service error, consistent with how the existing getFileSource wraps things in sew().
- Add an fs.existsSync(repoPath) guard before calling simpleGit().cwd(repoPath) in getFileSourceForRepo (getFileSourceApi.ts:45-46) to return a graceful notFound() instead of throwing.
Repo 131 _does_ exist on the filesystem with the correct permissions. It was just missed in the previous example due to the use of `head -n 10`.
Answer
Now I have the full picture. The root cause is a single breaking change introduced in v4.16.8 (PR #1076 - "Remove org domain from URLs") that reorganised the internal architecture and introduced getFileSourceForRepo.
PR #1076 refactored the URL structure and, as part of it, extracted getFileSourceForRepo as a new server-side function (getFileSourceApi.ts:26-101). This function calls simpleGit().cwd(repoPath) (getFileSourceApi.ts:46-46) directly, with no sew() or withOptionalAuth() wrapper.
At the same time, fetchFileContent.ts was updated to call getFileSourceForRepo (fetchFileContent.ts:38-38) instead of the old getFileSource.
The critical bug is in how fetchFileContent constructs the repo name to look up in the database (fetchFileContent.ts:30-34).
For a GitHub PR, this produces something like github.com/my-org/my-repo. This is then passed to getFileSourceForRepo, which does a DB lookup using name: repoName (getFileSourceApi.ts:30-35).
If that lookup fails to find a repo (e.g. the repo name doesn't match exactly, or the org lookup fails), notFound() is returned. But getFileSourceForRepo itself is unwrapped — there is no sew() wrapper. If simpleGit().cwd(repoPath) at line 46 throws before even reaching the git command — because the path it resolves doesn't match what simple-git 3.33 expects for a bare repo — the exception propagates up uncaught through the Next.js request handler.
simple-git 3.33 triggers it
The simple-git bump from 3.27.0 → 3.33.0 (package.json:53-55) changed how .cwd() validates the path. In 3.33, calling .cwd(repoPath) on a bare git repository (which is what Sourcebot clones; note that /data/.sourcebot/repos/131 has no .git subdirectory, just HEAD, objects, etc. directly) triggers an eager validation that tries to stat a path inside the repo directory (specifically looking for a .git folder or verifying it's a git root), and this validation throws EACCES on bare repos in certain edge cases in the newer version.
The {"commands":[],"format":"empty"} in the stack trace (getFileSourceApi.ts:46-46) is simple-git's internal representation of a task that failed during initialisation before any git command was issued, confirming it's the .cwd() call itself failing, not a git show command.
In v4.16.2, fetchFileContent called getFileSource, which was wrapped in sew(() => withOptionalAuth(...)) (getFileSourceApi.ts:103-116). The sew() middleware catches all thrown exceptions and converts them into structured ServiceError objects. In v4.16.14, getFileSourceForRepo has no such wrapper, so the EACCES thrown by simple-git propagates all the way up to z.onFatalException / z.attemptTask in the Next.js server runtime, which is exactly what your stack trace shows.
The error is caused by two changes landing together in v4.16.8:
- simple-git 3.33 eager .cwd() validation: throws EACCES when given a bare git repo path in certain conditions (package.json:180-182).
- getFileSourceForRepo has no error boundary: unlike the old getFileSource, exceptions are not caught by sew() and propagate as fatal errors to the Next.js runtime (getFileSourceApi.ts:26-101).
The fix is to wrap the simpleGit().cwd(repoPath) call in getFileSourceForRepo with a try/catch that maps the error to a ServiceError (e.g. unexpectedError()), consistent with how git errors are already caught at line 53 (getFileSourceApi.ts:51-62).
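A rough sketch of that kind of try/catch mapping is shown below. The ServiceError shape, the bodies of notFound() and unexpectedError(), and the openRepoSafely and errorCodeOf helpers are all assumptions made for illustration; only the helper names notFound and unexpectedError come from this thread.

```typescript
// Hypothetical error shape; the real ServiceError type may differ.
interface ServiceError {
  statusCode: number;
  errorCode: string;
  message: string;
}

// Assumed stand-ins for the notFound() / unexpectedError() helpers mentioned above.
const notFound = (message: string): ServiceError =>
  ({ statusCode: 404, errorCode: "NOT_FOUND", message });
const unexpectedError = (message: string): ServiceError =>
  ({ statusCode: 500, errorCode: "UNEXPECTED_ERROR", message });

// Extract a Node-style error code ("EACCES", "ENOENT", ...) if one is present.
function errorCodeOf(err: unknown): string | undefined {
  if (typeof err === "object" && err !== null && "code" in err) {
    const code = (err as { code?: unknown }).code;
    return typeof code === "string" ? code : undefined;
  }
  return undefined;
}

// Run `openRepo` and map failures to a ServiceError instead of letting them escape.
function openRepoSafely<T>(repoPath: string, openRepo: (p: string) => T): T | ServiceError {
  try {
    return openRepo(repoPath);
  } catch (err) {
    switch (errorCodeOf(err)) {
      case "ENOENT":
        return notFound(`Repository directory not found: ${repoPath}`);
      case "EACCES":
        return unexpectedError(`Repository directory not accessible: ${repoPath}`);
      default:
        return unexpectedError(err instanceof Error ? err.message : String(err));
    }
  }
}
```

In the terms used above, openRepo would be the simpleGit().cwd(repoPath) call. If that call reports its failure asynchronously rather than by throwing synchronously, the same mapping would need to live in an awaited try/catch instead.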
I've rolled back to 4.16.13 and repos appear to be refreshing as expected. Does the above analysis still ring true?
Answer
No, the previous analysis was on the wrong track. The diff between v4.16.13 and v4.16.14 is unambiguous: there is one specific change that causes the error.
v4.16.14 introduced GitLab MR support for the review agent (PR #1104). As part of this, getFileSource was refactored: the core file-fetch logic was extracted into a new getFileSourceForRepo function, and getFileSource was rewritten to simply call it (getFileSourceApi.ts:98-116).
The critical structural change is that getFileSourceForRepo sits entirely outside the sew() wrapper (getFileSourceApi.ts:20-96).
fetchFileContent was updated to call getFileSourceForRepo directly (fetchFileContent.ts:38-38). When simpleGit().cwd(repoPath) throws (for any reason, such as a transient I/O error or a repo being written to at that moment), the exception is no longer caught by sew() and propagates as a fatal error through the Next.js server task runner, producing the exact z.onFatalException / z.attemptTask stack trace you see.
Additionally, fetchFileContent was also changed in v4.16.14 to pass ref: pr_payload.head_sha (fetchFileContent.ts:33-33) where before ref was undefined. If head_sha refers to a commit that hasn't been fetched yet on the local bare clone (i.e. repo 131 is slightly behind), simple-git will now throw rather than silently falling back to HEAD.
The error is not a permissions issue at all. It is a missing error boundary: getFileSourceForRepo was extracted from inside sew() in v4.16.14 (getFileSourceApi.ts:20-33), meaning any exception from simpleGit().cwd() or the git operations inside it now propagates fatally instead of being caught and returned as a structured ServiceError. Rolling back to v4.16.13 restores the original sew()-wrapped code path, which is why everything works again on that version.
