
Firing 18 Concurrent OpenAI Calls: The Parallel Orchestration Behind iching.rocks Interpretations

When a user gets a paid reading on iching.rocks, the system needs to generate a comprehensive, multi-section I Ching interpretation. Not a single paragraph — a full document covering opening observations, hexagram judgments, changing line analyses, Ten Wings commentary, synthesis, and a conclusion. Each section requires its own carefully crafted prompt with injected domain data.

The naive approach would be to call OpenAI 18 times sequentially. At roughly 3.5 seconds per call, that’s over a minute of wall-clock time. The actual approach fires 18 concurrent OpenAI calls using Task.WhenAll, collapsing that to about 4 seconds total. Each request is independently tracked for token usage, encrypted at rest, and wrapped in its own error boundary so a single failure doesn’t take down the entire reading.

Here’s how it works.

The Section Architecture

Every reading is decomposed into named sections, each mapped to a numeric ID. The section map lives in DBaseMapping.cs:

public Dictionary<string, short> GetSectionMappings()
{
    return new Dictionary<string, short>
    {
        { "openingStatements", 1 },
        { "initialHexagramJudgement", 2 },
        { "initialHexagramImage", 3 },
        { "changingLinesLine1", 4 },
        { "changingLinesLine2", 5 },
        { "changingLinesLine3", 6 },
        { "changingLinesLine4", 7 },
        { "changingLinesLine5", 8 },
        { "changingLinesLine6", 9 },
        { "changingLinesAll", 10 },
        { "tenWingsInsights", 11 },
        { "resultingHexagramJudgment", 12 },
        { "resultingHexagramImage", 13 },
        { "synthesisAndGuidance", 14 },
        { "holdforadditional1", 15 },
        { "holdforadditional2", 16 },
        { "holdforadditional3", 17 },
        { "conclusionText", 18 },
    };
}

Not every reading uses all 18 slots. The actual section list is dynamic — it depends on which lines are changing (if you’re unfamiliar with how I Ching castings work, Wikipedia’s overview of I Ching divination explains changing lines and hexagram structure). The GetAllOpenAIRequests method in processing.cshtml.cs builds the list at runtime:

private static (List<short> Sections, List<string> JsonRequests) GetAllOpenAIRequests(
    Reading reading, IOpenAIRequestBuilder openaiRequestBuilder)
{
    var lSection = new List<short> { 1, 2, 3 };
    lSection.AddRange(reading.ChangingLines.Select(line => (short)(line + 3)));

    if (reading.ChangingLines.Count == 6)
    {
        lSection.Add(10);
    }

    lSection.AddRange(new short[] { 11, 12, 13, 14, 18 });

    var jsonRequests = lSection
        .Select(section => openaiRequestBuilder.GetOpenAIRequestJSON(reading, section))
        .ToList();

    return (lSection, jsonRequests);
}

Sections 1–3 are always present: opening statements, initial hexagram judgment, initial hexagram image. Then changing line sections are added dynamically — only the lines that actually changed. If all six lines change, a special “all lines changing” section (10) is included. Sections 11–14 and 18 round out every reading with Ten Wings commentary, resulting hexagram analysis, synthesis, and conclusion. Sections 15–17 are reserved for future expansion.

The result is typically 10–18 concurrent OpenAI calls depending on the reading, each with its own specialized prompt.
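As a concrete illustration, the selection logic reduces to a standalone sketch. The Reading object and request builder are dropped here; the changing lines are passed in directly:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Standalone sketch of the section-list logic in GetAllOpenAIRequests,
// taking the changing lines directly instead of the article's Reading object.
List<short> PlanSections(List<int> changingLines)
{
    var sections = new List<short> { 1, 2, 3 };            // always present
    sections.AddRange(changingLines.Select(line => (short)(line + 3)));
    if (changingLines.Count == 6)
        sections.Add(10);                                   // "all lines changing"
    sections.AddRange(new short[] { 11, 12, 13, 14, 18 }); // commentary, synthesis, conclusion
    return sections;
}

// A reading with lines 2 and 5 changing yields ten sections.
var plan = PlanSections(new List<int> { 2, 5 });
Console.WriteLine(string.Join(",", plan));   // 1,2,3,5,8,11,12,13,14,18
```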

Prompt Construction: Domain Data Injection

There’s a 4,000-year-old irony here. The traditional I Ching consultation process — cast coins, look up the resulting hexagram in a reference text, retrieve the relevant line commentaries, then interpret all of that in the context of your question — is structurally identical to what the AI industry now calls Retrieval-Augmented Generation. You generate a query, retrieve domain-specific knowledge based on the result, and feed that context to a model for synthesis. The ancient sages were doing RAG with yarrow stalks and bamboo scrolls. The implementation below just swaps in JSON and Task.WhenAll.

Each prompt isn’t just a text string — it’s a structured JSON object that injects the full domain context of the reading into the request. The OpenAIRequestBuilder constructs these:

public string GetOpenAIRequestJSON(Reading reading, short isection)
{
    string sectionName = _dbMapping.GetSectionMappings()
        .FirstOrDefault(x => x.Value == isection).Key;

    if (string.IsNullOrEmpty(sectionName))
        throw new ArgumentException("Invalid section number");

    var request = new
    {
        userQuestion = reading.QuestionPlainText,
        myData = new
        {
            initialHexagram = new
            {
                hexagramNumber = reading.Hexegram1.hexagram,
                name = reading.Hexegram1.name,
                trigram_above = new
                {
                    font = reading.Hexegram1.trigram_above?.font,
                    chinese = reading.Hexegram1.trigram_above?.chinese,
                    symbolic = reading.Hexegram1.trigram_above?.symbolic,
                    english = reading.Hexegram1.trigram_above?.english
                },
                trigram_below = new
                {
                    font = reading.Hexegram1.trigram_below?.font,
                    chinese = reading.Hexegram1.trigram_below?.chinese,
                    symbolic = reading.Hexegram1.trigram_below?.symbolic,
                    english = reading.Hexegram1.trigram_below?.english
                }
            },
            changingLines = reading.ChangingLines?.Select(cl => new
            {
                lineNumber = cl,
                isLastChangingLine = cl == reading.ChangingLines.LastOrDefault()
            }).ToArray(),
            numberOfChangingLines = reading.ChangingLines.Count,
            resultingHexagram = new
            {
                hexagramNumber = reading.Hexegram2.hexagram,
                name = reading.Hexegram2.name,
                trigram_above = new { /* same structure */ },
                trigram_below = new { /* same structure */ }
            }
        },
        openaiReturn = GetSectionPrompt(sectionName, reading.HasChangingLines)
    };

    return JsonSerializer.Serialize(request, new JsonSerializerOptions { WriteIndented = true });
}

Every request gets the same domain data envelope — the user’s question, both hexagrams with their trigram details, the changing lines with positional metadata — but a different openaiReturn prompt tailored to that specific section. The domain data gives the model everything it needs to interpret the reading. The section prompt tells it what aspect to focus on and how to format the output.

The section prompts themselves are extensive. They specify paragraph counts, word ranges, formatting requirements, transition language, and the exact JSON structure the response must follow. Here’s a trimmed example for the Ten Wings section:

private string GetSectionPrompt(string sectionName, bool hasChangingLines)
{
    var prompts = hasChangingLines
        ? new Dictionary<string, string>
        {
            { "openingStatements", "Create a warm and engaging introduction that reflects " +
              "the user's query and thoughtfully sets the stage for a deeper interpretation. " +
              "Present the main themes of the initial hexagram... Aim for a minimum of 4 to 5 " +
              "paragraphs... Generate the response in JSON format using only the following " +
              "HTML tags: <p>, <b>, and <em>..." },
            { "tenWingsInsights", "Provide an in-depth exploration of the Ten Wings " +
              "commentary for this I Ching reading, covering each section in a cohesive " +
              "narrative: (1) Commentary on the Judgment (Tuan Zhuan)... (2) Commentary " +
              "on the Image (Xiang Zhuan)... (3) Explanation of the Trigrams (Shuo Gua)... " +
              "(4) Miscellaneous Notes (Za Gua)... (5) Appended Sayings (Da Zhuan)... " +
              "Structure the response in 5-7 paragraphs, each containing 150-300 words..." },
            // ... remaining sections
        }
        : new Dictionary<string, string>
        {
            // Separate prompt set for unchanging hexagrams
            { "openingStatements", "Create a warm and engaging introduction... " +
              "immediately clarifies that this reading involves an unchanging hexagram, " +
              "offering a stable and enduring perspective..." },
            // ...
        };

    return prompts.TryGetValue(sectionName, out string prompt) ? prompt : "";
}

There’s a critical design decision here: the prompt dictionary branches on hasChangingLines. Readings with changing lines get one set of prompts; unchanging readings get a completely different set. The unchanging prompts are longer and more detailed per section because there are fewer sections to fill — no changing line analysis, no resulting hexagram. The model needs to go deeper on fewer topics to produce a reading of comparable depth.

Each prompt also specifies an exact JSON response structure — { "response": ["<p>...</p>", "<p>...</p>"] } — which allows the application to parse and render the output directly as HTML without additional transformation.
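A minimal sketch of consuming that contract, assuming the model honors the specified shape, pulls the HTML fragments out with JsonDocument:

```csharp
using System;
using System.Linq;
using System.Text.Json;

// Sketch of parsing the section response contract described above:
// extract the HTML paragraphs from { "response": [...] }.
string[] ExtractParagraphs(string json)
{
    using var doc = JsonDocument.Parse(json);
    return doc.RootElement
        .GetProperty("response")
        .EnumerateArray()
        .Select(el => el.GetString() ?? "")
        .ToArray();
}

var parts = ExtractParagraphs("{\"response\": [\"<p>First.</p>\", \"<p>Second.</p>\"]}");
Console.WriteLine(parts.Length);   // 2
```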

Parallel Orchestration: Firing Concurrent OpenAI Calls

This is the core of the system. Once all the request JSON strings are built, they fire concurrently:

private static async Task<(int succeeded, int failed)> ProcessOpenAIRequestAsync(
    Reading r,
    IDBHandler db,
    IOpenAIRequestBuilder openaiRequestBuilder,
    IEncryptionHelper encryptionHelper,
    IOptions<ApplicationSettings> settings,
    HttpClient client,
    ILogger logger)
{
    var (sections, lRequestJson) = GetAllOpenAIRequests(r, openaiRequestBuilder);

    int iSucceeded = 0;
    int iFailed = 0;
    var tasks = new List<Task>();

    for (int i = 0; i < lRequestJson.Count; i++)
    {
        int idx = i; // capture for closure
        tasks.Add(Task.Run(async () =>
        {
            try
            {
                await SendOpenAIRequestAsync(
                    lRequestJson[idx], sections[idx], r.ID,
                    db, encryptionHelper, settings, client, logger);
                Interlocked.Increment(ref iSucceeded);
            }
            catch (Exception ex)
            {
                logger.LogError(ex, "Section {Section} failed for reading {ReadingId}",
                    sections[idx], r.ID);
                Interlocked.Increment(ref iFailed);
            }
        }));
    }

    await Task.WhenAll(tasks);
    logger.LogDebug($"All sections processed for reading {r.ID} — " +
                    $"{iSucceeded} succeeded, {iFailed} failed");
    return (iSucceeded, iFailed);
}

A few things to notice here.

The int idx = i closure capture is essential. Without it, the lambda captures the loop variable i by reference, and by the time the tasks execute, i has already advanced. Classic async closure bug — easy to miss, hard to debug in production.
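A contrived, self-contained demonstration of the rule (not code from the project): copying i into idx freezes the value per iteration, whereas capturing i directly would let every task observe whatever the shared loop variable holds when it finally runs.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var seen = new List<int>();
var tasks = new List<Task>();
for (int i = 0; i < 3; i++)
{
    int idx = i;   // per-iteration copy, as in ProcessOpenAIRequestAsync
    tasks.Add(Task.Run(() => { lock (seen) seen.Add(idx); }));
}
await Task.WhenAll(tasks);
seen.Sort();
Console.WriteLine(string.Join(",", seen));   // 0,1,2
```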

Interlocked.Increment handles thread-safe counting. Multiple tasks complete concurrently on different thread pool threads, so a plain iSucceeded++ would race. Interlocked.Increment provides an atomic increment without needing a lock.

Each task wraps its own try/catch. This is the key to partial failure resilience — if section 6 (changing line 3) throws an HttpRequestException because OpenAI returned a 500, sections 1–5 and 7–18 are unaffected. The method returns a tuple of succeeded and failed counts so the caller can decide how to handle partial completions.

The method itself is static and takes all dependencies as parameters. This was a deliberate refactoring — the original version captured PageModel instance fields in the closure, which meant the background task was holding references to request-scoped DI services that could be disposed after the HTTP response returned. Making it static with explicit parameters eliminates that class of bug entirely.

Sending Requests and Tracking Tokens

Each individual request follows a straightforward pattern — construct the HTTP request, call OpenAI, parse the response, encrypt both sides, extract token counts, and persist everything:

private static async Task<string> SendOpenAIRequestAsync(
    string jsonRequest, short isection, long rid,
    IDBHandler db, IEncryptionHelper encryptionHelper,
    IOptions<ApplicationSettings> settings, HttpClient client, ILogger logger)
{
    var apiKey = settings.Value.OpenAISettings.ApiKey;
    var apimodel = settings.Value.OpenAISettings.OpenApiModel;
    var openaiendpoint = settings.Value.OpenAISettings.ChatEndPoint;

    var requestData = new
    {
        model = apimodel,
        messages = new[] { new { role = "user", content = jsonRequest } },
        max_tokens = 4096
    };

    string jsonRequestData = JsonSerializer.Serialize(requestData);

    var requestMessage = new HttpRequestMessage(HttpMethod.Post, openaiendpoint);
    requestMessage.Headers.Authorization =
        new AuthenticationHeaderValue("Bearer", apiKey);
    requestMessage.Content =
        new StringContent(jsonRequestData, Encoding.UTF8, "application/json");

    var response = await client.SendAsync(requestMessage);
    var fullresponse = await response.Content.ReadAsStringAsync();

    if (response.IsSuccessStatusCode)
    {
        // Encrypt request and response for storage
        var eRequest = encryptionHelper.EncryptToBytes(jsonRequest);
        var eResponse = encryptionHelper.EncryptToBytes(fullresponse);

        // Parse token usage from the response
        var jsonResponse = JsonDocument.Parse(fullresponse);
        var usage = jsonResponse.RootElement.GetProperty("usage");
        int promptTokens = usage.GetProperty("prompt_tokens").GetInt32();
        int completionTokens = usage.GetProperty("completion_tokens").GetInt32();
        int totalTokens = usage.GetProperty("total_tokens").GetInt32();

        // Persist encrypted data with token metrics
        await db.OpenAIInsertReadingPartAsync(
            rid, eRequest, eResponse,
            promptTokens, completionTokens, totalTokens, isection);

        return fullresponse;
    }

    throw new HttpRequestException(
        $"OpenAI API Error: {response.StatusCode} - {fullresponse}");
}

The authorization header is set per-request via HttpRequestMessage.Headers.Authorization rather than on HttpClient.DefaultRequestHeaders. This is important — DefaultRequestHeaders is a shared mutable collection on HttpClient, and mutating it from 18 concurrent tasks creates a race condition. Setting headers per-request is inherently thread-safe.

Token usage is extracted directly from the OpenAI response JSON using JsonDocument — prompt tokens, completion tokens, and total tokens for every single section. This gives granular cost visibility down to the individual section level.
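As an illustration of what that visibility enables, per-section costs fall out of simple arithmetic. The per-million-token rates and token counts below are placeholders, not actual OpenAI pricing or iching.rocks data:

```csharp
using System;

// Back-of-envelope cost estimate from per-section token counts.
// Rates are illustrative placeholders and belong in configuration.
decimal EstimateCost(int promptTokens, int completionTokens,
    decimal promptRatePerM, decimal completionRatePerM)
    => promptTokens * promptRatePerM / 1_000_000m
     + completionTokens * completionRatePerM / 1_000_000m;

// e.g. a hypothetical section at 1,200 prompt / 1,800 completion tokens:
Console.WriteLine(EstimateCost(1200, 1800, 2.50m, 10.00m));   // ≈ $0.021
```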

The database persistence uses Dapper with a stored procedure. Each call opens its own SqlConnection, making concurrent inserts safe:

public async Task OpenAIInsertReadingPartAsync(
    long rid, byte[] request, byte[] response,
    int prompttoken, int completiontokens, int totaltokens,
    short iChingSection)
{
    using IDbConnection dbConn = new SqlConnection(
        _configuration.GetConnectionString("dbConn"));

    var parameters = new DynamicParameters();
    parameters.Add("@RID", rid, DbType.Int64);
    parameters.Add("@Request", request, DbType.Binary);
    parameters.Add("@Response", response, DbType.Binary);
    parameters.Add("@PromptTokens", prompttoken, DbType.Int32);
    parameters.Add("@CompletionTokens", completiontokens, DbType.Int32);
    parameters.Add("@TotalTokens", totaltokens, DbType.Int32);
    parameters.Add("@IChingSection", iChingSection, DbType.Int16);

    await dbConn.ExecuteAsync("[dbo].[OpenAIInsertReadingPartByRID]",
        parameters, commandType: CommandType.StoredProcedure);
}

The Request and Response columns are varbinary(max) — they store the AES-encrypted byte arrays directly. No Base64 encoding overhead for storage, no string conversion. The token counts sit alongside as plain integers for easy aggregation in cost monitoring queries.

The admin panel surfaces this data in a per-reading detail view that shows every section’s timing, token counts, and throughput:

<table class="table table-sm table-striped">
    <thead>
        <tr>
            <th>#</th>
            <th>Section</th>
            <th>Delta</th>
            <th>Prompt Tk</th>
            <th>Compl Tk</th>
            <th>Total Tk</th>
            <th>Tk/sec</th>
            <th>Model</th>
        </tr>
    </thead>
</table>

This lets me see not just how much a reading cost, but how each section performed — which prompts are expensive, which ones are slow, and where optimization effort would have the most impact.

AES-256 Encrypted Storage

Every OpenAI request and response is encrypted before it hits the database. The encryption layer uses AES-256-CBC with per-encryption IV generation:

public class EncryptionHelper : IEncryptionHelper
{
    private readonly byte[] Key;
    private readonly byte[] IV;

    public EncryptionHelper(IOptions<ApplicationSettings> settings)
    {
        Key = GetValidKey(settings.Value.EncryptionSettings.Key);
        IV = GetValidIV(settings.Value.EncryptionSettings.IV);
    }

    private byte[] GetValidKey(string key)
    {
        using (SHA256 sha256 = SHA256.Create())
        {
            return sha256.ComputeHash(Encoding.UTF8.GetBytes(key));
        }
    }
}

The key derivation uses SHA-256 to hash the configured key string into a 256-bit (32-byte) key. This ensures the AES key is always the correct length regardless of what string is configured in appsettings.json. SHA-256 output is deterministic and fixed-length — the same configuration key always produces the same AES key. In a greenfield build I’d reach for PBKDF2 here — it runs the hash thousands of times in a loop, which adds brute-force resistance that a single-pass SHA-256 doesn’t provide. Pragmatically, with a high-entropy key string the risk is theoretical, but it’s worth noting.
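A sketch of that PBKDF2 alternative, using .NET's built-in Rfc2898DeriveBytes.Pbkdf2 with an assumed salt and iteration count (not code from the project):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// PBKDF2 key derivation sketch: same configured key string, plus a salt
// stored alongside it. The iteration count is what adds the brute-force
// resistance a single SHA-256 pass lacks.
byte[] DeriveAesKey(string configuredKey, byte[] salt) =>
    Rfc2898DeriveBytes.Pbkdf2(
        Encoding.UTF8.GetBytes(configuredKey),
        salt,
        iterations: 100_000,
        HashAlgorithmName.SHA256,
        outputLength: 32);   // 256-bit AES key, same size as the SHA-256 approach

var key = DeriveAesKey("configured-key-string", new byte[16]);
Console.WriteLine(key.Length);   // 32
```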

The critical security property is in EncryptToBytes — every encryption generates a fresh IV:

public byte[] EncryptToBytes(string plainText)
{
    if (string.IsNullOrEmpty(plainText))
        throw new ArgumentNullException(nameof(plainText));

    using (Aes aesAlg = Aes.Create())
    {
        aesAlg.Key = Key;
        aesAlg.GenerateIV();          // Fresh IV per encryption
        var iv = aesAlg.IV;

        ICryptoTransform encryptor = aesAlg.CreateEncryptor(aesAlg.Key, iv);

        using (MemoryStream msEncrypt = new MemoryStream())
        {
            msEncrypt.Write(iv, 0, iv.Length);  // Prepend IV to ciphertext
            using (CryptoStream csEncrypt = new CryptoStream(
                msEncrypt, encryptor, CryptoStreamMode.Write))
            using (StreamWriter swEncrypt = new StreamWriter(csEncrypt))
            {
                swEncrypt.Write(plainText);
            }
            return msEncrypt.ToArray();
        }
    }
}

aesAlg.GenerateIV() produces a cryptographically random 16-byte IV for every single encryption operation. The IV is prepended to the ciphertext in the output byte array — first 16 bytes are the IV, the rest is the encrypted data. This means that even if the same prompt is used for two different readings, the ciphertext will be completely different.

Decryption reverses the process — extract the IV from the first 16 bytes, use it with the key to decrypt the remainder:

public string DecryptFromBytes(byte[] cipherBytes)
{
    if (cipherBytes == null || cipherBytes.Length <= 16)
        throw new ArgumentException("The cipher text is invalid or corrupted.");

    using (Aes aesAlg = Aes.Create())
    {
        aesAlg.Key = Key;
        aesAlg.IV = cipherBytes.Take(16).ToArray();  // Extract prepended IV

        ICryptoTransform decryptor = aesAlg.CreateDecryptor(aesAlg.Key, aesAlg.IV);

        using (MemoryStream msDecrypt = new MemoryStream(cipherBytes.Skip(16).ToArray()))
        using (CryptoStream csDecrypt = new CryptoStream(
            msDecrypt, decryptor, CryptoStreamMode.Read))
        using (StreamReader srDecrypt = new StreamReader(csDecrypt))
        {
            return srDecrypt.ReadToEnd();
        }
    }
}
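A self-contained roundtrip of the same IV-prepended scheme, with the key passed in directly instead of coming from IOptions, shows both properties: identical plaintexts produce different ciphertexts, and decryption recovers the original.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

byte[] Encrypt(string plainText, byte[] key)
{
    using var aes = Aes.Create();
    aes.Key = key;
    aes.GenerateIV();                          // fresh IV per call
    using var ms = new MemoryStream();
    ms.Write(aes.IV, 0, aes.IV.Length);        // prepend IV to ciphertext
    using (var cs = new CryptoStream(ms, aes.CreateEncryptor(), CryptoStreamMode.Write))
    using (var sw = new StreamWriter(cs))
        sw.Write(plainText);
    return ms.ToArray();
}

string Decrypt(byte[] cipherBytes, byte[] key)
{
    using var aes = Aes.Create();
    aes.Key = key;
    aes.IV = cipherBytes.Take(16).ToArray();   // recover the prepended IV
    using var ms = new MemoryStream(cipherBytes.Skip(16).ToArray());
    using var cs = new CryptoStream(ms, aes.CreateDecryptor(), CryptoStreamMode.Read);
    using var sr = new StreamReader(cs);
    return sr.ReadToEnd();
}

var key = SHA256.HashData(Encoding.UTF8.GetBytes("configured-key"));
var c1 = Encrypt("same plaintext", key);
var c2 = Encrypt("same plaintext", key);
Console.WriteLine(c1.SequenceEqual(c2));       // False: fresh IV each time
Console.WriteLine(Decrypt(c1, key));           // same plaintext
```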

There’s also a size-aware truncation utility that accounts for AES overhead when the encrypted output needs to fit within database column constraints:

public static string TruncateForEncryption(string plainText, int maxEncryptedBytes = 8000)
{
    const int AesOverhead = 32; // 16 IV + 16 max padding
    int maxPlaintextBytes = maxEncryptedBytes - AesOverhead;

    var bytes = Encoding.UTF8.GetBytes(plainText);
    if (bytes.Length <= maxPlaintextBytes)
        return plainText;

    var truncated = Encoding.UTF8.GetString(bytes, 0, maxPlaintextBytes);
    return truncated.TrimEnd('\uFFFD'); // Strip replacement chars left by a split multi-byte sequence
}

AES-CBC adds up to 16 bytes of PKCS7 padding plus the 16-byte prepended IV. The TruncateForEncryption method accounts for both, and handles the edge case where truncating at a byte boundary might split a multi-byte UTF-8 character (the \uFFFD replacement character trim).
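A worked boundary case, with the method repeated as a local function so the demo is self-contained:

```csharp
using System;
using System.Text;

// Copied from the article so the demo runs on its own.
string TruncateForEncryption(string plainText, int maxEncryptedBytes = 8000)
{
    const int AesOverhead = 32; // 16 IV + 16 max padding
    int maxPlaintextBytes = maxEncryptedBytes - AesOverhead;
    var bytes = Encoding.UTF8.GetBytes(plainText);
    if (bytes.Length <= maxPlaintextBytes)
        return plainText;
    var truncated = Encoding.UTF8.GetString(bytes, 0, maxPlaintextBytes);
    return truncated.TrimEnd('\uFFFD');
}

// "ab易" is 5 UTF-8 bytes (1 + 1 + 3). A 4-byte plaintext budget cuts the
// CJK character in half; decoding yields U+FFFD, which is trimmed away.
Console.WriteLine(TruncateForEncryption("ab易", maxEncryptedBytes: 36));   // ab
```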

Error Handling and Partial Failure Resilience

When you fire 18 concurrent OpenAI calls to an external API, some will occasionally fail. The architecture handles this at multiple levels.

At the individual request level, SendOpenAIRequestAsync throws on non-success HTTP status codes. At the orchestration level, each task’s try/catch catches that exception, logs it, and increments the failure counter without affecting other tasks.

The caller — the background Task.Run in OnGetAsync — uses the success/failure counts to decide whether the reading is complete:

_ = Task.Run(async () => {
    using var scope = scopeFactory.CreateScope();
    var scopedDb = scope.ServiceProvider.GetRequiredService<IDBHandler>();

    try
    {
        var (iSucceeded, iFailed) = await ProcessOpenAIRequestAsync(
            reading, scopedDb, openaiRequestBuilder,
            encryptionHelper, settings, client, logger);

        if (iFailed == 0)
        {
            await scopedDb.PaidUpdatePaidReadingByID(reading.ID);
            logger.LogDebug($"Marked reading {reading.ID} as complete");
        }
        else
        {
            logger.LogWarning(
                $"Reading {reading.ID}: {iFailed} of {iSucceeded + iFailed} " +
                "sections failed. Reading NOT marked complete.");
        }
    }
    catch (Exception ex)
    {
        logger.LogError(ex, $"OpenAI processing failed for reading {reading.ID}");
    }
});

Only readings where every section succeeds are marked complete. If any section fails, the reading stays in a processing state where it can be retried or investigated. This is a deliberate choice — a partial reading with missing sections would be confusing to the user. Better to fail cleanly than to deliver an incomplete interpretation.

The Fire-and-Forget Pattern

The entire OpenAI processing pipeline runs in a fire-and-forget Task.Run that’s decoupled from the HTTP request lifecycle. The Razor Page returns immediately with a processing UI, while the background task handles the actual API calls.

This creates a DI scope management challenge. The IDBHandler is registered as a scoped service — its lifetime is tied to the HTTP request. Once OnGetAsync returns, that scope is disposed. The background task would be using a dead service instance.

The fix is to create a dedicated DI scope for the background task:

var scopeFactory = _scopeFactory;  // Capture IServiceScopeFactory before closure

_ = Task.Run(async () => {
    using var scope = scopeFactory.CreateScope();
    var scopedDb = scope.ServiceProvider.GetRequiredService<IDBHandler>();
    // ... use scopedDb, not _db
});

IServiceScopeFactory is a singleton — safe to capture in a closure that outlives the request. The using var scope ensures the new scope (and its resolved services) are properly disposed when the background task completes.

The remaining dependencies captured in the closure — HttpClient (from IHttpClientFactory, thread-safe), ILogger (thread-safe by design), IEncryptionHelper (stateless after construction), IOptions<ApplicationSettings> (singleton) — are all safe to use across scope boundaries.

Why This Architecture

The parallel orchestration exists because the alternative is unacceptable. Sequential execution of 18 OpenAI calls at ~3.5 seconds each produces a 63-second wait. I know this because it happened — during a code review refactoring, the parallel execution was accidentally serialized into a for loop. Processing time jumped from about 4 seconds to over a minute. The fix was immediate: restore Task.WhenAll with per-task error boundaries.

The encryption exists because the system stores user questions alongside AI-generated interpretations. People ask the I Ching personal questions — about relationships, career decisions, health concerns. That data deserves encryption at rest regardless of what the regulatory environment requires.

The per-request token tracking exists because OpenAI API costs are the primary operational expense. Knowing that the Ten Wings section averages 1,800 completion tokens while the conclusion averages 600 lets me tune prompts with cost awareness. The admin panel’s per-section breakdown turns token tracking from an afterthought into a first-class operational metric.

The partial failure resilience exists because 18 concurrent external API calls will eventually produce partial failures. When they do, the system logs exactly which sections failed, preserves everything that succeeded, and leaves the reading in a state that can be investigated and retried — rather than losing all 17 successful responses because one request timed out.

These aren’t academic architecture decisions. Every one of them was validated by a production incident or discovered during a multi-agent code review pipeline where Claude and Codex audited the codebase against each other. The system is better for it.

Frequently Asked Questions

Why use Task.WhenAll instead of Parallel.ForEach for concurrent OpenAI calls?

Task.WhenAll is designed for I/O-bound async work like HTTP calls. Parallel.ForEach is for CPU-bound work and burns thread pool threads while waiting. When you’re waiting on 18 external API responses, Task.WhenAll is the right tool.

How do you handle OpenAI rate limits with 18 concurrent requests?

OpenAI’s rate limits are per-minute token and request quotas, not per-second. Eighteen simultaneous requests from a single reading is well within standard tier limits. If you were processing multiple readings concurrently, you’d want a SemaphoreSlim to throttle.
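A sketch of that throttle, with a Func<Task<int>> list standing in for the real per-section request delegates:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// SemaphoreSlim gate capping in-flight calls while Task.WhenAll
// still awaits the full set.
async Task<int[]> WhenAllThrottled(IReadOnlyList<Func<Task<int>>> calls, int maxConcurrent)
{
    using var gate = new SemaphoreSlim(maxConcurrent);
    var tasks = calls.Select(async call =>
    {
        await gate.WaitAsync();    // parks here once the cap is reached
        try { return await call(); }
        finally { gate.Release(); }
    }).ToList();                   // materialize so every task is started
    return await Task.WhenAll(tasks);
}

var calls = Enumerable.Range(1, 5)
    .Select(i => (Func<Task<int>>)(() => Task.FromResult(i * i)))
    .ToList();
var results = await WhenAllThrottled(calls, maxConcurrent: 2);
Console.WriteLine(string.Join(",", results));   // 1,4,9,16,25
```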

Why encrypt OpenAI requests and responses at rest?

Users submit personal questions — about relationships, health, career decisions. Storing those questions and their AI-generated interpretations in plaintext would expose sensitive data if the database were ever compromised.

Why not use OpenAI’s Batch API instead of concurrent calls?

The Batch API is designed for bulk offline processing with a 24-hour turnaround. A user waiting for their reading needs results in seconds, not hours. Concurrent real-time calls are the only option here.

What happens if one of the 18 OpenAI calls fails?

Each call is wrapped in its own error boundary. Failed sections are logged and counted, successful sections are persisted normally. The reading is only marked complete if all sections succeed.
