Power BI / Power Query

Step-by-step Power Query setup for pulling GET /2020-01/roleplay/reports/sessions into Power BI: paged retrieval with throttle-aware retry, a session/goal star schema, and native incremental refresh keyed on publishedAt — the publication date — so refreshes can never silently miss a session.

Prerequisites

An API key with Roleplay API access, sent as the X-API-KEY header.
Base URL: https://api.rehearsal.com (production).

Response envelope (what the queries below rely on)

Field	Meaning
`items`	The session records for this page.
`hasMore`	`true` — more records exist beyond this page.
`nextPageToken`	Opaque cursor for the next page — pass it back verbatim.
`totalCount`	Approximate total, present only on the first call (no `pageToken`). Capped at 10000.
`totalCountCapped`	`true` means the total met/exceeded the 10000 cap — display as “10000+”.
`pageSize`	Echo of the effective page size.

With orderBy=publishedAt the final page also returns a nextPageToken — a resume watermark (see the API reference). This guide’s recipe windows by publishedAfter/publishedBefore instead, so the queries below simply stop when hasMore is false and can ignore it.

Other filters: environmentIds / learnerIds accept up to 100 IDs; to filter by learner email, resolve emails first via POST /2020-01/roleplay/lookup/profile-ids. pageSize range is 1–100 (default 10) — use 100 for bulk pulls. Date filters are detailed below.

The two timestamps: `completedAt` and `publishedAt`

Every record carries two UTC timestamps with different jobs:

Timestamp	Meaning	Filters	Use for
`completedAt`	When the learner finished the session	`completedAfter` / `completedBefore`	Business questions (“May’s sessions”), slicing, one-time pulls
`publishedAt`	When the session’s results became visible in this report — its AI analysis succeeded, typically under a minute after completion, occasionally up to ~15 minutes on retries	`publishedAfter` / `publishedBefore` (require `orderBy=publishedAt`)	Replication: incremental refresh, warehouse sync

One boundary rule for all four date filters: *After is inclusive (>=), *Before is exclusive (<), and a date-only value (YYYY-MM-DD) means midnight UTC at the start of that day. Adjacent half-open windows never overlap and never gap — “all of May” is completedAfter=2026-05-01&completedBefore=2026-06-01. Datetimes must carry a timezone (Z or an offset like -04:00); zoneless datetimes are rejected with a 400. (publishedAfter/publishedBefore accept only the datetime form.)

Working in a local timezone: all timestamps are UTC, and the recommended approach is to keep the API window in UTC too — replicate by publishedAt (the incremental-refresh recipe below) and group by local day in your data model, using a date dimension table that maps UTC dates to your local calendar. DST then lives in one maintained table instead of in filter logic. Power Query itself has no IANA timezone support (DateTimeZone.SwitchZone applies a fixed offset, which goes wrong across DST transitions), so local-day logic belongs in the model, not in M.

If you genuinely must limit a business-window pull to one exact local day at the API, send local midnight-to-midnight as offset datetimes — computed by an IANA-aware timezone library, never hardcoded — e.g. May 29 in New York:

completedAfter=2026-05-29T00:00:00-04:00&completedBefore=2026-05-30T00:00:00-04:00

Setting up in Power BI

In Power Query, create the two incremental-refresh parameters: Manage Parameters → New — RangeStart and RangeEnd (exactly those names), type Date/Time, any initial values a few days apart (e.g. 2026-05-01 and 2026-06-01; Desktop uses them only for preview — the Power BI service replaces them per partition).
Create a blank query named SessionsRaw and paste Query 1. Replace YOUR_API_KEY — or better, promote BaseUrl / ApiKey to Power BI parameters.
Right-click SessionsRaw → Enable load: off (it is a base query the others build on).
Create Sessions from Query 2, SessionGoals from Query 3, and SessionPerformanceMetrics from Query 4.
When prompted for credentials, connect as Anonymous (the API key travels in the header).
In the model view, relate SessionGoals[sessionId] → Sessions[sessionId] and SessionPerformanceMetrics[sessionId] → Sessions[sessionId] (many-to-one) for a star schema.
Configure the incremental-refresh policy on all three tables — see Incremental refresh: configuration.

Query 1 — `SessionsRaw` (base query; disable load)

Fetches all pages with throttle-aware retry and flattens to one row per session, keeping the goals and performanceMetrics list columns for Queries 3 and 4. In Power BI: right-click the query → Enable load: off.

let
    // ── Configuration (promote BaseUrl/ApiKey to Power BI parameters) ─────────
    BaseUrl  = "https://api.rehearsal.com",
    ApiKey   = "YOUR_API_KEY",
    PageSize = "100",       // max; fewer requests = fewer throttles

    // RangeStart/RangeEnd are Power BI's incremental-refresh parameters (type
    // Date/Time, exactly those names — created in step 1). Values are treated
    // as UTC. The API window is half-open exactly like Power BI's partition
    // convention: >= RangeStart, < RangeEnd — every record lands in exactly
    // one partition.
    ToApiUtc = (t as datetime) as text =>
        DateTime.ToText(t, [Format = "yyyy-MM-dd'T'HH:mm:ss"]) & "Z",

    // ── One page, with 429-aware retry ─────────────────────────────────────────
    // Retries ONLY on HTTP 429 (throttle). Other statuses fail immediately — a 400
    // never succeeds on retry. Max 5 attempts per page (1 initial + 4 retries).
    // Delay: Retry-After header when present, else exponential 0.5s, 1s, 2s, 4s.
    FetchPage = (pageToken as nullable text, attempt as number) as record =>
        let
            resp = Web.Contents(BaseUrl, [
                RelativePath = "2020-01/roleplay/reports/sessions",
                Query = [
                            pageSize = PageSize,
                            orderBy = "publishedAt",
                            publishedAfter = ToApiUtc(RangeStart),
                            publishedBefore = ToApiUtc(RangeEnd)
                        ]
                        & (if pageToken <> null then [pageToken = pageToken] else []),
                Headers = [#"X-API-KEY" = ApiKey],
                ManualStatusHandling = {429},   // inspect 429 instead of erroring
                IsRetry = attempt > 1           // bypass Power Query's response cache on retry
            ]),
            meta = Value.Metadata(resp),
            status = meta[Response.Status],
            // Honor Retry-After when the service sends one; fall back to exponential.
            headers = Record.FieldOrDefault(meta, "Headers", []),
            retryAfterText = Record.FieldOrDefault(headers, "Retry-After", null),
            retryAfterSec = if retryAfterText <> null then (try Number.From(retryAfterText) otherwise null) else null,
            delaySec = if retryAfterSec <> null and retryAfterSec > 0
                       then retryAfterSec
                       else Number.Power(2, attempt - 1) * 0.5,
            result =
                if status = 429 and attempt < 5 then
                    Function.InvokeAfter(() => @FetchPage(pageToken, attempt + 1), #duration(0, 0, 0, delaySec))
                else if status = 429 then
                    error "Rehearsal API: still throttled (HTTP 429) after 5 attempts"
                else
                    Json.Document(resp)
        in
            result,

    // ── Walk all pages via nextPageToken ───────────────────────────────────────
    // 1s self-throttle between pages keeps a bulk pull comfortably under the
    // gateway rate limit, so the retry path above is rarely exercised.
    AllPages = List.Generate(
        () => FetchPage(null, 1),
        each _ <> null,
        each if _[hasMore]
             then Function.InvokeAfter(() => FetchPage(_[nextPageToken], 1), #duration(0, 0, 0, 1))
             else null
    ),

    // ── Flatten to one row per session ────────────────────────────────────────
    AllItems = List.Combine(List.Transform(AllPages, each _[items])),
    AsTable = Table.FromList(AllItems, Splitter.SplitByNothing(), {"session"}),
    Expanded = Table.ExpandRecordColumn(AsTable, "session",
        {"sessionId", "startedAt", "completedAt", "publishedAt", "status", "score", "feedback", "learner", "environment", "goals", "performanceMetrics"}),
    ExpandLearner = Table.ExpandRecordColumn(Expanded, "learner",
        {"id", "name", "email"}, {"learner.id", "learner.name", "learner.email"}),
    ExpandEnvironment = Table.ExpandRecordColumn(ExpandLearner, "environment",
        {"id", "name"}, {"environment.id", "environment.name"}),
    Typed = Table.TransformColumnTypes(ExpandEnvironment, {
        {"sessionId", type text},
        {"startedAt", type datetimezone},
        {"completedAt", type datetimezone},
        {"publishedAt", type datetimezone},
        {"score", type number}
    }),

    // Redundant with the server-side window, but makes each partition provably
    // pure — Power BI's incremental refresh expects rows filtered by
    // RangeStart/RangeEnd, and this guarantees it locally too.
    Partitioned = Table.SelectRows(Typed, each
        [publishedAt] >= DateTime.AddZone(RangeStart, 0) and
        [publishedAt] <  DateTime.AddZone(RangeEnd, 0))
in
    Partitioned

Query 2 — `Sessions` (session grain)

let
    Source = SessionsRaw,
    NoLists = Table.RemoveColumns(Source, {"goals", "performanceMetrics"})
in
    NoLists

Query 3 — `SessionGoals` (goal grain)

One row per goal per session. Relate to Sessions on sessionId (many-to-one) for a star schema.

let
    Source = Table.SelectColumns(SessionsRaw, {"sessionId", "goals"}),
    PerGoal = Table.ExpandListColumn(Source, "goals"),
    Expanded = Table.ExpandRecordColumn(PerGoal, "goals",
        {"name", "status", "explanation"}, {"goal.name", "goal.status", "goal.explanation"})
in
    Expanded

goal.status carries the canonical tri-state outcome: met / partiallyMet / notMet.

Query 4 — `SessionPerformanceMetrics` (metric grain)

One row per performance metric per session — the “Performance Metrics” section of the native session report. Relate to Sessions on sessionId (many-to-one), same as SessionGoals.

let
    Source = Table.SelectColumns(SessionsRaw, {"sessionId", "performanceMetrics"}),
    PerMetric = Table.ExpandListColumn(Source, "performanceMetrics"),
    Expanded = Table.ExpandRecordColumn(PerMetric, "performanceMetrics",
        {"name", "status", "explanation"}, {"metric.name", "metric.status", "metric.explanation"})
in
    Expanded

metric.status uses the same tri-state outcomes as goals. Sessions analyzed before performance metrics existed have an empty performanceMetrics list; Table.ExpandListColumn turns each empty list into a single row with nulls, so filter out rows with a null metric.name if you need a strict metric grain.

Incremental refresh: configuration

With RangeStart/RangeEnd wired into Query 1, configure the policy:

In the report view, right-click the Sessions table → Incremental refresh.
Archive data starting: e.g. 2 years before refresh date (size to how much history you want in the model).
Incrementally refresh data starting: e.g. 14 days before refresh date. Leave “Get the latest data in real time (DirectQuery)” and “Only refresh complete days” off.
Repeat for SessionGoals and SessionPerformanceMetrics (both inherit the window through SessionsRaw).
Publish to the service. The first scheduled refresh builds all historical partitions — that is your backfill; no separate full pull is needed.

Why this recipe cannot lose a session:

The lag case is covered by construction. A session that completes just before a refresh but is analyzed just after it gets publishedAt = analysis time — inside the next refresh’s 14-day window. It simply arrives one refresh later, regardless of when it completed.
Both bounds, exactly one partition. Each partition queries publishedAfter = RangeStart and publishedBefore = RangeEnd — the API’s half-open window matches Power BI’s own >= RangeStart, < RangeEnd partition convention, so a record can never be double-counted across partitions. (A lower bound alone would double-count every overlap.)
Re-deliveries dedupe themselves. Refreshing a partition replaces its rows wholesale, so the at-least-once delivery semantics cost nothing inside the refresh window.

Modeling notes

Group by environment.id / learner.id (stable UUIDs) and label with environment.name / learner.name. Names can change; grouping by name splits history on rename.
Performance metric names have no stable ID — metric.name is a display label in the environment’s language, captured when the session was analyzed. Pivoting by metric.name works cleanly within one language and an unchanged metric setup, but a metric rename or a mixed-locale account splits history, exactly like grouping environments by name would.
One-time business pulls (an ad-hoc export, a fiscal-quarter extract) can use completedAfter/completedBefore on a separate query — business slicing is what the completion-time filters are for. Keep scheduled refresh on the publishedAt recipe above.
The queries use RelativePath/Query against a fixed BaseUrl deliberately — that keeps the data source static, so scheduled refresh in the Power BI service works (string-concatenated URLs get flagged as a dynamic data source and refuse to refresh).

Throttling: what to expect

The API key is subject to an API-gateway usage plan (token bucket: steady requests/sec + burst). The cheapest strategy is not needing retries at all:

pageSize=100 — a few hundred sessions is a handful of requests.
Pace page requests ~1s apart (built into Query 1) — negligible cost in a scheduled refresh.
Let incremental refresh do its job — each scheduled refresh pulls only the refresh-window partitions, not full history.

The retry in Query 1 exists as a safety net and follows these rules: trigger on HTTP 429 only; max 5 attempts per page request (budget resets each page); honor Retry-After when present, else exponential 0.5s → 4s. There is intentionally no jitter — fine for one sequential walker. If you run several queries against the same key in parallel, add random jitter (±50%) to delaySec so retries don’t synchronize.