SharePoint Data Connector

The BigPanda Unified Data Connector (UDC) syncs SharePoint list metadata through the Microsoft Graph API to provide context and insights for AI Incident Assistant (Biggy), AI Incident Prevention, and AI Detection and Response. 

Ingested data is securely stored and made available in the IT Knowledge Graph, powering analytics, trend analysis, and downstream operations workflows.

Metadata-only sync

The SharePoint connector ingests list item metadata and selected scalar list columns. It does not move document contents, attachments, page HTML, or binary payloads from document libraries into BigPanda.

Rows from document libraries may appear as list items (for example, file name, modified date, URL), but the connector does not download file bodies.

Non-scalar column types (lookups, person/group, multi-value choice, etc.) may be omitted or normalized depending on how Microsoft Graph represents them. Use internal SharePoint column names, not display labels.

When to use this connector

  • SharePoint lists are the source of truth (change calendars, KB indexes, runbook trackers, approval lists, etc.).

  • You need scheduled, incremental sync into the IT Knowledge Graph.

  • Linking to SharePoint items via webUrl is sufficient; file bodies are not required in BigPanda.

When not to use this connector

  • Primary knowledge lives in document library files (Word, PDF, etc.).

  • You need wiki page HTML or attachment binaries in BigPanda.

  • Knowledge is in Confluence or ServiceNow KB. Use those connectors instead.

  • You run SharePoint on-premises farms (non-Microsoft 365). These are not supported by this Graph-based connector.

Authentication

The SharePoint connector uses OAuth 2.0 client credentials against Microsoft Entra ID (Azure AD) and calls Microsoft Graph. BigPanda refreshes credentials before sync requests, so scheduled runs continue without manual re-authorization. Auth strategy cannot be changed when editing an existing connection.

Microsoft Graph prerequisites

Before BigPanda can configure the connector, your organization must register an application in Microsoft Entra ID and grant it application (not delegated) read access to the SharePoint list data you want to sync. The app registration requires the following admin-consented Microsoft Graph application permission:

  •  Sites.Read.All 

Provide the client ID, the client secret, and the following connection settings to your BigPanda account team, who will complete the authorization and set up the connector.

| Setting | Value |
| --- | --- |
| instance_url | https://graph.microsoft.com |
| Auth strategy | OAuth 2.0 client credentials |
| Application (client) ID | From your Entra app registration |
| Client secret | The secret Value (not the Secret ID) |
| oauth2_token_url | https://login.microsoftonline.com/<tenant_id>/oauth2/v2.0/token |
| scope | https://graph.microsoft.com/.default (used by default when omitted) |

instance_url is always the Graph API base URL. The SharePoint site path is configured separately on the pipeline as site_url.
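To make the settings above concrete, here is a minimal sketch of the standard OAuth 2.0 client credentials token request that these values describe. It only builds the request and does not send it. The function name build_token_request and the placeholder values are assumptions for illustration, not part of the BigPanda connector.

```python
from urllib.parse import urlencode

DEFAULT_SCOPE = "https://graph.microsoft.com/.default"


def build_token_request(tenant_id, client_id, client_secret, scope=DEFAULT_SCOPE):
    """Return (url, headers, body) for a client credentials token request.

    Hypothetical helper: it mirrors the oauth2_token_url and scope settings
    from the table, and falls back to the default scope when none is given.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,  # the secret Value, not the Secret ID
        "scope": scope,
    })
    return url, headers, body


if __name__ == "__main__":
    url, headers, body = build_token_request("<tenant_id>", "<client_id>", "<secret>")
    print(url)
```

The form-encoded body would be sent as a POST to the token URL. The access token in the response is then passed to Microsoft Graph as a bearer token.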

Configure the SharePoint connector

Provide the following configuration to your BigPanda account team. The connector creates one output table per configured list.

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| site_url | Yes | | SharePoint site path in Graph format: hostname:/sites/.... For example: contoso.sharepoint.com:/sites/MySite. Do not include https://. |
| lists | Yes | | Map of output table name to list definition. Each list produces one output table. |
| list_id | Yes | | Within each lists entry, the Microsoft Graph list GUID. |
| field_names | No | All columns | Within each lists entry, the specific list columns to include, using their internal SharePoint names. Omit this field to load all list columns. |
| start_date | Yes | | The start of the sync window, in YYYY-MM-DD format. Sets the initial sync window and incremental cursor baseline. Required when creating a pipeline. Must be today or earlier. |
| cron_schedule | Yes | | Cron expression for scheduled sync runs (for example, every 15 minutes). |
| timezone | No | UTC | Timezone for schedule interpretation. |
| page_size | No | 100 | Items per Graph request ($top). Maximum 999. |
| rate_limit | No | 20 | Maximum requests per minute. |
| rate_limit_timeout_ms | No | 1000 | Milliseconds to wait when the local rate limiter throttles requests. |
| request_timeout | No | 60 | Seconds before a Graph request times out. |

Example configuration

{
  "cron_schedule": "*/15 * * * *",
  "start_date": "2024-01-01",
  "timezone": "UTC",
  "site_url": "contoso.sharepoint.com:/sites/MySite",
  "lists": {
    "tasks": {
      "list_id": "00000000-0000-0000-0000-000000000001",
      "field_names": ["Title", "Status", "Priority"]
    }
  },
  "page_size": 100,
  "rate_limit": 20,
  "rate_limit_timeout_ms": 1000,
  "request_timeout": 60
}
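Before handing a configuration to your account team, it can help to check it against the documented constraints. The following is a rough sketch of such a pre-check, not BigPanda's own validation. The function name validate_config, the regular expressions, and the lower bound of 1 for page_size are assumptions for illustration.

```python
import re
from datetime import date, datetime
from typing import List, Optional

# Assumed patterns: a standard 8-4-4-4-12 GUID, and hostname:/path with no scheme.
GUID_RE = re.compile(r"^[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")
SITE_RE = re.compile(r"^[A-Za-z0-9.-]+:/[^/\s]\S*$")
REQUIRED = ("site_url", "lists", "start_date", "cron_schedule")


def validate_config(cfg: dict, today: Optional[date] = None) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    today = today or date.today()
    errors = [f"missing required field: {key}" for key in REQUIRED if key not in cfg]

    site_url = cfg.get("site_url")
    if site_url is not None and not SITE_RE.match(site_url):
        errors.append("site_url must use Graph site path format without https://")

    for table, entry in cfg.get("lists", {}).items():
        if not GUID_RE.match(str(entry.get("list_id", ""))):
            errors.append(f"lists.{table}.list_id is not a GUID")

    start = cfg.get("start_date")
    if start is not None:
        try:
            if datetime.strptime(start, "%Y-%m-%d").date() > today:
                errors.append("start_date must be today or earlier")
        except ValueError:
            errors.append("start_date must use YYYY-MM-DD format")

    # The documented maximum is 999; the minimum of 1 is an assumption.
    if not 1 <= cfg.get("page_size", 100) <= 999:
        errors.append("page_size must be between 1 and 999")
    return errors
```

Passing the example configuration above as a Python dict should return an empty list. A site_url that starts with https:// or a page_size of 1000 would each add one entry.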

Finding a list GUID

In SharePoint, open the list and select Settings > List settings. The list GUID appears in the page URL as List=%7B<guid>%7D. Alternatively, use Microsoft Graph to enumerate the lists for the site.
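The URL-encoded braces (%7B and %7D) must be removed before the GUID is used as list_id. A small sketch of that extraction follows. The function name and the sample URL are illustrative assumptions, and the exact settings page path may differ in your tenant.

```python
from urllib.parse import parse_qs, urlparse


def list_guid_from_settings_url(url: str) -> str:
    """Extract the list GUID from a List settings URL containing List=%7B<guid>%7D."""
    query = parse_qs(urlparse(url).query)  # parse_qs decodes %7B and %7D to { and }
    values = query.get("List")
    if not values:
        raise ValueError("URL has no List query parameter")
    return values[0].strip("{}")


if __name__ == "__main__":
    # Illustrative URL only; copy the real one from your browser's address bar.
    sample = ("https://contoso.sharepoint.com/sites/MySite/_layouts/15/listedit.aspx"
              "?List=%7B00000000-0000-0000-0000-000000000001%7D")
    print(list_guid_from_settings_url(sample))
```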

Output schema

The connector creates one output table per lists entry. Each table uses id as its primary key and Modified as its sync cursor. Each record includes the following metadata along with any selected scalar list columns.

| Field | Description |
| --- | --- |
| id | Unique identifier of the list item. This is the primary key for the table. |
| createdDateTime | The date and time the item was created. |
| lastModifiedDateTime | The date and time the item was last modified, as reported by Microsoft Graph. |
| Modified | The SharePoint Modified column. The connector uses this column as the sync cursor. |
| webUrl | A link to the item in SharePoint. |
| Selected scalar list columns | Any scalar columns you name in field_names for the list. Only scalar values are included. When field_names is omitted, all list columns are included. |
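As an illustration of the schema above, the sketch below flattens one Microsoft Graph list item (with its fields expanded) into a single output record and drops non-scalar values. This is an assumed approximation of the mapping, not the connector's actual code. The function name to_record and the sample item are hypothetical.

```python
SCALAR_TYPES = (str, int, float, bool)


def to_record(item: dict, field_names=None) -> dict:
    """Flatten a Graph list item into one row: item metadata plus scalar columns."""
    fields = item.get("fields", {})
    record = {
        "id": item.get("id"),
        "createdDateTime": item.get("createdDateTime"),
        "lastModifiedDateTime": item.get("lastModifiedDateTime"),
        "Modified": fields.get("Modified"),
        "webUrl": item.get("webUrl"),
    }
    # With no field_names, consider every column Graph returned for the item.
    names = field_names if field_names is not None else list(fields)
    for name in names:
        if name in record or name.startswith("@"):
            continue  # skip metadata already mapped and OData annotations
        value = fields.get(name)
        if value is None or isinstance(value, SCALAR_TYPES):
            record[name] = value  # lookups, person fields, and arrays are omitted
    return record


if __name__ == "__main__":
    sample = {
        "id": "7",
        "createdDateTime": "2024-01-02T00:00:00Z",
        "lastModifiedDateTime": "2024-01-03T00:00:00Z",
        "webUrl": "https://contoso.sharepoint.com/sites/MySite/Lists/tasks/7_.000",
        "fields": {"Title": "Patch", "Modified": "2024-01-03T00:00:00Z",
                   "Owner": {"LookupId": 3}},
    }
    print(to_record(sample))
```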

Sync behavior

Ongoing sync is incremental based on the SharePoint Modified column.

  1. Initial / backfill: On the first run (or after a cursor reset), the connector loads items with Modified on or after start_date.

  2. Subsequent runs: The connector loads items with Modified greater than or equal to the stored cursor from the previous successful run.

  3. Scheduling: cron_schedule controls how often incremental runs execute.
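The steps above can be sketched as a small cursor routine. This is a simplified model under stated assumptions: start_date is treated as midnight UTC, the cursor advances to the latest Modified value seen, and rows are keyed by id so that re-reading an item at the cursor boundary overwrites rather than duplicates it. None of these helper names come from the connector.

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing Z for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def initial_cursor(start_date: str) -> datetime:
    """Assumption: start_date (YYYY-MM-DD) means midnight UTC on that day."""
    return datetime.strptime(start_date, "%Y-%m-%d").replace(tzinfo=timezone.utc)


def select_and_advance(items, cursor):
    """Return (items with Modified >= cursor, new cursor)."""
    selected = [it for it in items if parse_ts(it["Modified"]) >= cursor]
    new_cursor = max((parse_ts(it["Modified"]) for it in selected), default=cursor)
    return selected, new_cursor


if __name__ == "__main__":
    items = [
        {"id": "1", "Modified": "2023-12-31T23:59:59Z"},
        {"id": "2", "Modified": "2024-01-01T00:00:00Z"},
        {"id": "3", "Modified": "2024-02-01T10:00:00Z"},
    ]
    table = {}
    cursor = initial_cursor("2024-01-01")
    for _ in range(2):  # first run, then one incremental run
        selected, cursor = select_and_advance(items, cursor)
        table.update({it["id"]: it for it in selected})  # keyed by id
    print(sorted(table), cursor.isoformat())
```

Because the comparison is greater than or equal to, the item that set the cursor is read again on the next run. Keying rows by id keeps that repeat harmless.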

Filtering uses the SharePoint Modified column

The connector filters on the SharePoint Modified field rather than the Graph lastModifiedDateTime field, because Microsoft Graph does not support filtering on lastModifiedDateTime for list items.
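For reference, a Graph list items request that filters on the Modified column might look like the sketch below. The exact query the connector issues is not documented here, so the URL shape (v1.0 endpoint, path-based site addressing, $expand on fields) and the function name build_items_url are assumptions. Depending on the list, Graph may also require the Modified column to be indexed or a Prefer request header that allows non-indexed queries.

```python
from urllib.parse import quote

GRAPH_BASE = "https://graph.microsoft.com/v1.0"  # assumed API version


def build_items_url(site_url, list_id, cursor_iso, field_names=None, page_size=100):
    """Build a list items URL filtered on fields/Modified (illustrative only)."""
    expand = "fields"
    if field_names:
        expand += "($select=" + ",".join(field_names) + ")"
    item_filter = f"fields/Modified ge '{cursor_iso}'"
    query = "&".join([
        "$expand=" + quote(expand, safe="($=,)"),
        "$filter=" + quote(item_filter, safe="/'"),
        f"$top={page_size}",
    ])
    # site_url is already in Graph site path format, e.g. hostname:/sites/MySite
    return f"{GRAPH_BASE}/sites/{site_url}:/lists/{list_id}/items?{query}"


if __name__ == "__main__":
    print(build_items_url(
        "contoso.sharepoint.com:/sites/MySite",
        "00000000-0000-0000-0000-000000000001",
        "2024-01-01T00:00:00Z",
        field_names=["Title", "Status", "Priority"],
    ))
```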

Request and performance controls

You can tune the following controls to manage paging and request behavior.

| Control | Default | Description |
| --- | --- | --- |
| page_size | 100 | The number of items requested from Microsoft Graph in a single call. This maps to the Graph $top page size. |
| rate_limit | 20 | The maximum number of requests sent per minute. |
| rate_limit_timeout_ms | 1000 | Wait time, in milliseconds, when the connector's rate limiter throttles outbound requests. |
| request_timeout | 60 | How long the connector waits, in seconds, before timing out a request. |

When Microsoft Graph returns HTTP 429, the connector honors the Retry-After header before retrying.
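The Retry-After behavior can be pictured with a short retry loop. This is a generic sketch rather than the connector's implementation. The send callable, the attempt limit, and the one-second fallback are assumptions, and Retry-After is assumed to carry a number of seconds.

```python
import time


def send_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status, waiting Retry-After seconds.

    send is a hypothetical callable returning (status_code, headers, body).
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts:
            # Assumed fallback of 1 second when the header is missing.
            sleep(float(headers.get("Retry-After", 1)))
    raise RuntimeError("request still throttled after all retry attempts")


if __name__ == "__main__":
    # Simulated responses: one throttled reply, then success.
    replies = [(429, {"Retry-After": "2"}, None), (200, {}, "ok")]
    print(send_with_retry(lambda: replies.pop(0), sleep=lambda s: None))
```

Injecting sleep as a parameter keeps the loop testable without real waiting.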

Troubleshooting

If a sync run fails, review the items below.

| Symptom | What to check |
| --- | --- |
| Authentication failures | Entra app client ID, client secret Value, Sites.Read.All application permission, and admin consent are valid. Token URL uses the correct tenant_id. |
| Wrong API endpoint | Connection instance_url is https://graph.microsoft.com, not the SharePoint site hostname. Site path belongs in pipeline site_url. |
| Site resolution failures | site_url uses Graph site path format (hostname:/sites/...) without https://. Site exists and the app can read it. |
| List not found / empty table | list_id is the correct Graph list GUID for that site. App has read access to the site and list. |
| Missing or wrong columns | field_names use internal SharePoint column names (for example Title, not a display label). |
| Rate limiting (429) | Lower rate_limit or increase rate_limit_timeout_ms. Connector already respects Graph Retry-After. |
| Unexpected date range | start_date is YYYY-MM-DD, not in the future, and reflects the backfill window you intend. |
| No file contents | This is expected. The connector syncs metadata only, not document binaries. |

FAQs

Why does the connection use graph.microsoft.com instead of our SharePoint site URL?

The connection authenticates to Microsoft Graph. The SharePoint site is configured separately on the pipeline as site_url.

Can this connector ingest Word or PDF files from document libraries?

No. It syncs list item metadata only. File name and URL may appear for library rows, but file bodies are not downloaded.

Do we need Files.Read.All?

No, not for this connector. Sites.Read.All (application) is sufficient for list metadata.