
ObjectStore on Windows: Small Transactions Become Extremely Slow After Cache Is Filled (Transaction-Boundary VirtualProtect Overhead)


Overview

On ObjectStore for Windows, “small” read-only transactions can become dramatically slower (reported as up to ~1000× slower) after a large read fills the ObjectStore cache. Engineering analysis determined the slowdown occurs primarily at transaction boundaries (commit/abort), where ObjectStore must re-apply memory protection/invalidation across cached pages using Windows memory protection APIs (for example, VirtualProtect).

The cost scales mainly with cache occupancy / number of cached regions, not the size of the current transaction. This is considered an architectural behavior on Windows rather than a defect addressed by a specific patch in this case.

Solution

Symptoms / How to Recognize the Issue

You may be experiencing this behavior if you see the following sequence:

  1. Small read-only transactions are fast immediately after startup or with an empty/light cache.
  2. A large read (or sequence of reads) populates/fills the ObjectStore cache and becomes faster on subsequent runs (expected caching benefit).
  3. After the cache is full, even very small transactions become orders of magnitude slower (e.g., “up to ~1000× slower”).
  4. Clearing the cache restores the original small-transaction performance.

Reproduced Environment

  • Platform: Windows (reported reproduction on Windows 11, local database and local test)
  • ObjectStore version: ObjectStore Cumulative Update Release 2025.1 Update 0
  • Example configuration used in testing:
    • OS_AS_SIZE=0x400000000
    • OS_CACHE_SIZE=0x78000000

Root Cause (Engineering Analysis)

At the end of every transaction (commit or abort), ObjectStore must preserve consistency by invalidating/protecting cached page mappings. On Windows, this is implemented using OS memory protection mechanisms (for example, VirtualProtect) applied over the cached address space.

Key point: the transaction-end cost scales primarily with cache occupancy (and the number of cached memory regions/pages), not with the amount of data touched by the current “small” transaction. After a large read fills the cache, every subsequent transaction boundary can incur the high scan/protect cost, making small transactions appear extremely slow.

This behavior is considered architectural on Windows and was not treated as a defect requiring a patch in this case.

Troubleshooting / What to Check

  1. Confirm the workload pattern
    • Measure “N small transactions” with an empty/cold cache (baseline).
    • Run a “large read” that populates the cache.
    • Measure the same “N small transactions” again after the cache is full.
    • If performance returns to baseline after clearing the cache, that supports this diagnosis.
  2. Validate that the slowdown is at transaction boundaries
    • If your profiling tools allow, look for time concentrated around commit/abort or transaction teardown when cache is full.
  3. Confirm configuration and system context
    • Record current values for OS_CACHE_SIZE and OS_AS_SIZE.
    • Record CPU core count and whether the workload runs across many cores (relevant to Windows TLB flush / IPI overhead during protection changes).
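The small/large/small comparison above can be driven by a minimal timing harness (plain C++ sketch; `runSmallTransactions` and `runLargeRead` in the usage comment are hypothetical stand-ins for your ObjectStore workload, not ObjectStore APIs):

```cpp
#include <chrono>
#include <functional>

// Time an arbitrary workload step in milliseconds. The workload callbacks
// (e.g. "run N small transactions", "run one large cache-filling read")
// are supplied by the application; nothing here is ObjectStore-specific.
double time_ms(const std::function<void()>& step) {
    auto t0 = std::chrono::steady_clock::now();
    step();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// Usage sketch (hypothetical workload functions):
//   double baseline = time_ms(runSmallTransactions); // 1. cold cache
//   time_ms(runLargeRead);                           // 2. fill the cache
//   double after    = time_ms(runSmallTransactions); // 3. full cache
//   // A large after/baseline ratio supports this diagnosis.
```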

Mitigation Options (Choose Based on Workload Constraints)

1) Reduce cache size (OS_CACHE_SIZE) — configuration change

Why it helps: fewer cached pages/regions means less transaction-end scan/protect work.

Trade-off: large reads benefit less from caching.

Example: set OS_CACHE_SIZE smaller, such as:

OS_CACHE_SIZE=0x10000000  // example value (256 MB)

Validate: rerun the “small transactions after full-cache” scenario and compare end-to-end time.


2) Reduce address space size (OS_AS_SIZE) — configuration change

Why it helps: reduces the range involved in scan/protect operations.

Trade-off: must remain large enough for your dataset plus overhead.

Example:

OS_AS_SIZE=0x100000000  // example value (4 GB)

Validate: repeat the benchmark and confirm both improved timings and stability (no address-space allocation/mapping failures).


3) Batch multiple reads into fewer transactions — code change

Why it helps: the expensive work occurs once per transaction boundary; fewer transactions means fewer times you pay the cost.

Trade-off: longer transactions may hold locks longer and reduce concurrency.

Example pattern:

OS_BEGIN_TXN(txn, 0, os_transaction::read_only)
  // count: number of small reads batched into one transaction
  for (int i = 0; i < count; i++) {
    processSmallRead(i);
  }
OS_END_TXN(txn)

Validate: compare total time for N reads as N transactions versus fewer batched transactions.


4) Clear cache after large/batch operations — code change

Why it helps: removing accumulated cached regions after a large read prevents subsequent small transactions from paying “full-cache” transaction-end overhead.

Trade-off: subsequent accesses will fetch from server/disk again.

Example:

objectstore::return_all_pages();

Validate: confirm that clearing pages returns small-transaction timing to baseline.


5) Co-locate related objects in the same segment — design/code change

Why it helps: more contiguous allocation can reduce the number of distinct memory regions tracked in cache, lowering transaction-end overhead.

Trade-off: requires allocation strategy changes and may affect load-time performance.

Example pattern:

new(os_segment::of(this), ...) DataElement(...);

Validate: measure performance before/after and confirm application concurrency behavior remains acceptable.


6) Reduce active CPU cores / set CPU affinity — system-level change

Why it helps: on Windows, memory protection changes can trigger cross-core TLB flush behavior; fewer cores can reduce that overhead.

Trade-off: reduces parallelism.

Example approach: set CPU affinity for key ObjectStore processes using Windows Task Manager → Details → Set affinity, or programmatically (for example, via the Win32 SetProcessAffinityMask API).

Validate: compare the same benchmark with default affinity versus restricted affinity.

Cache/Address-Space Metrics Request

A request to expose “cache usage” and “address space usage” metrics (API/tooling) was reviewed and accepted into the product roadmap. No delivery commitment or ETA is available.

Frequently Asked Questions

1. How do I know I’m hitting this specific behavior and not a general performance regression?
Look for the distinctive sequence: small transactions are fast with an empty cache, become extremely slow only after a large read fills the cache, and then return to baseline after clearing the cache (or after reducing cache/address space).
2. Is this a defect fixed in a later ObjectStore patch?
In this scenario, it was identified as architectural behavior related to transaction-end cache mapping/protection work on Windows (for example, VirtualProtect), not a specific defect with a patch referenced in the investigation.
3. Which mitigation is safest to try first if I cannot change application code?
Start with configuration-only changes: reduce OS_CACHE_SIZE and/or reduce OS_AS_SIZE (ensuring it remains large enough for your dataset). Then validate by rerunning the same small/large/small benchmark pattern.
4. What verification steps should I use after changing OS_CACHE_SIZE or OS_AS_SIZE?
Re-run a controlled benchmark: (1) small transactions with cold cache, (2) a large read to fill cache, (3) the same small transactions again. Confirm that step (3) no longer shows extreme slowdown and that the system remains stable (no mapping/allocation failures).
5. If my app is interactive and cannot batch transactions, what should I do?
Favor tuning OS_CACHE_SIZE and OS_AS_SIZE to “as small as possible while still meeting workload needs,” and consider targeted cache clearing after known large operations (if your workflow includes any), plus segment co-location strategies to reduce region fragmentation.
6. Is there an API/tool in the current ObjectStore version to show cache usage and address space usage?
No such API/tool was confirmed as available in the investigated scenario. A request to add these metrics was accepted into the product roadmap, but no ETA is available.
Posted by Priyanka Bhotika