⚠️ Deprecated ⚠️

This feature is deprecated and will be removed in the future.

It is not recommended for use.

  • Import from "@langchain/community/document_loaders/web/github" instead. This entrypoint will be removed in 0.3.0.

A document loader that loads files from a GitHub repository. It extends BaseDocumentLoader and implements the GithubRepoLoaderParams interface.

Properties

apiUrl: string

The API endpoint URL of the GitHub instance. Set this when you are not targeting github.com, e.g. for a GitHub Enterprise instance.

baseUrl: string

The base URL of the GitHub instance. Set this when you are not targeting github.com, e.g. for a GitHub Enterprise instance.

branch: string
ignoreFiles: (string | RegExp)[]
processSubmodules: boolean

Set to true to recursively process submodules. Only takes effect when recursive is true.

recursive: boolean
accessToken?: string
ignore?: Ignore
ignorePaths?: string[]
maxConcurrency?: number

The maximum number of concurrent calls that can be made. Defaults to 2.

maxRetries?: number

The maximum number of retries for a single call, with exponential backoff between attempts. Defaults to 2.

verbose?: boolean
caller: AsyncCaller
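
As a rough illustration of the retry behavior described for maxRetries, the following TypeScript sketch retries a failing call with exponential backoff. It is not the library's AsyncCaller: callWithRetries and baseDelayMs are hypothetical names, and the actual delays the library uses are not stated on this page.

```typescript
// Hypothetical helper for illustration only; not the library's AsyncCaller.
async function callWithRetries<T>(
  task: () => Promise<T>,
  maxRetries = 2, // mirrors the documented default of 2
  baseDelayMs = 100, // assumed value; the real delay is not documented here
): Promise<T> {
  let attempt = 0;
  for (;;) {
    try {
      return await task();
    } catch (err) {
      // Give up once the retry budget is spent and surface the last error.
      if (attempt >= maxRetries) throw err;
      // Wait 1x, 2x, 4x, ... the base delay before the next attempt.
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      attempt += 1;
    }
  }
}
```

With maxRetries set to 2, a call is attempted at most three times: the initial attempt plus two retries.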

Methods

  • Fetches the files from the GitHub repository and creates a Document instance for each file. Errors are handled according to the unknown handling option.

    Returns Promise<Document[]>

    A promise that resolves to an array of Document instances.

  • Asynchronously streams documents from the entire GitHub repository. Suitable for processing large repositories in a memory-efficient manner.

    Returns AsyncGenerator<Document, void, undefined>

    Yields

    A Document for each file or submodule content found in the repository.

  • Determines whether a file or directory should be ignored based on its path and type.

    Parameters

    • path: string

      The path of the file or directory.

    • fileType: string

      The type of the file or directory.

    Returns boolean

    A boolean indicating whether the file or directory should be ignored.
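
The three methods above can be pictured together with a small TypeScript sketch: an ignore check, an async generator that yields one document at a time, and an eager loader that collects the stream into an array. This is a simplified model, not the loader's implementation. RepoEntrySketch, DocSketch, and the three function names are hypothetical, the entries come from an in-memory array instead of the GitHub API, and the matching rule (strings compared exactly, RegExp tested against the path) is an assumption.

```typescript
// Hypothetical stand-ins; the real loader fetches entries from the GitHub API.
interface RepoEntrySketch {
  path: string;
  type: "file" | "dir";
  content: string;
}

interface DocSketch {
  pageContent: string;
  metadata: { source: string };
}

// Assumption: a string pattern must equal the path exactly, while a RegExp is
// tested against the path. Directories are never ignored in this sketch.
function shouldIgnoreSketch(
  path: string,
  fileType: string,
  ignoreFiles: Array<string | RegExp>,
): boolean {
  if (fileType === "dir") return false;
  return ignoreFiles.some((pattern) =>
    typeof pattern === "string" ? pattern === path : pattern.test(path),
  );
}

// Streaming variant: yields one document at a time, so the caller never has
// to hold the whole repository in memory.
async function* streamDocsSketch(
  entries: RepoEntrySketch[],
  ignoreFiles: Array<string | RegExp>,
): AsyncGenerator<DocSketch, void, undefined> {
  for (const entry of entries) {
    if (entry.type !== "file") continue;
    if (shouldIgnoreSketch(entry.path, entry.type, ignoreFiles)) continue;
    yield { pageContent: entry.content, metadata: { source: entry.path } };
  }
}

// Eager variant: drains the stream into an array before returning.
async function loadAllSketch(
  entries: RepoEntrySketch[],
  ignoreFiles: Array<string | RegExp>,
): Promise<DocSketch[]> {
  const docs: DocSketch[] = [];
  for await (const doc of streamDocsSketch(entries, ignoreFiles)) {
    docs.push(doc);
  }
  return docs;
}
```

The streaming variant suits large repositories because each document can be processed and discarded before the next one is produced, whereas the eager variant is simpler when the full array is needed anyway.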