Performance Comparison: main vs feature/php-8.5-only
Cold Build Times (No Cache)
| Documentation | Files | main Branch | feature Branch | Time Improvement |
|---|---|---|---|---|
| TYPO3 Core Changelog | 3667 | 970s (16.2min) | 47.3s | -95.1% (923s faster) |
| TYPO3 Core API | 957 | 128.6s | 34.9s | -72.9% (94s faster) |
| Rendertest | 98 | 9.1s | 6.4s | -29.7% (2.7s faster) |
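The "Time Improvement" column follows directly from the two timings in each row. A minimal sketch of that arithmetic is shown here; the function name `time_improvement` is chosen for illustration and is not part of the benchmark scripts.

```python
def time_improvement(main_seconds: float, feature_seconds: float) -> tuple[float, float]:
    """Return (percent change, seconds saved) when going from main to feature."""
    saved = main_seconds - feature_seconds
    percent = -100.0 * saved / main_seconds
    return round(percent, 1), round(saved, 1)


# Timings copied from the cold-build table above.
rows = [
    ("TYPO3 Core Changelog", 970.0, 47.3),
    ("TYPO3 Core API", 128.6, 34.9),
    ("Rendertest", 9.1, 6.4),
]
for name, main_s, feature_s in rows:
    percent, saved = time_improvement(main_s, feature_s)
    print(f"{name}: {percent}% ({saved}s faster)")
```

The table rounds the seconds saved to whole numbers for the two larger documentation sets.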
Resource Usage Comparison
| Documentation | Branch | CPU Usage | Peak Memory |
|---|---|---|---|
| TYPO3 Core Changelog (3667 files) | main | 101% | 2986 MB |
| TYPO3 Core Changelog (3667 files) | feature | 440% | ~1.3 GB (-56%) |
| TYPO3 Core API (957 files) | main | 91% | 358 MB |
| TYPO3 Core API (957 files) | feature | 391% | ~580 MB (+62%) |
| Rendertest (98 files) | main | 62% | 98 MB |
| Rendertest (98 files) | feature | 214% | ~200 MB (+104%) |
CPU Usage >100%: Indicates parallel processing via pcntl_fork across multiple cores. For example, 440% CPU means roughly 4.4 cores utilized on average.
Memory Trade-off: For extra-large docs (Changelog), feature branch uses ~56% less memory (~1.3 GB vs 3 GB) because forked workers process batches and exit, avoiding memory accumulation. For smaller docs, fork overhead increases total memory but provides significant speed improvements.
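The batch-and-exit pattern behind this memory behaviour can be sketched as follows. This is an illustration only, not the ForkingRenderer code: it uses Python's `os.fork` as a stand-in for PHP's `pcntl_fork`, the function name `render_in_forked_batches` and the pipe-plus-pickle result transfer are assumptions, and it forks one child per batch without capping the number of workers.

```python
import os
import pickle


def render_in_forked_batches(documents, batch_size, render):
    """Render each batch in its own forked child; children exit when done."""
    children = []
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: render the batch, send results to the parent, then exit.
            exit_code = 1
            try:
                os.close(read_fd)
                with os.fdopen(write_fd, "wb") as pipe:
                    pickle.dump([render(doc) for doc in batch], pipe)
                exit_code = 0
            finally:
                # Exiting returns all memory the child allocated to the OS.
                os._exit(exit_code)
        # Parent: keep only the read end and move on to the next batch.
        os.close(write_fd)
        children.append((pid, read_fd))

    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as pipe:
            results.extend(pickle.load(pipe))
        os.waitpid(pid, 0)
    return results


print(render_in_forked_batches(["a.rst", "b.rst", "c.rst"], 2, str.upper))
```

Because each child holds only its own batch and terminates afterwards, memory allocated while rendering is not accumulated in the long-lived parent process. The price is the fixed per-fork overhead that shows up on the smaller documentation sets.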
[Chart: Build Time Comparison (seconds)]
Cold vs Warm Build Performance (Feature Branch)
Warm builds benefit from Twig template caching, inventory caching, and other optimizations.
| Documentation | Cold Build | Warm Build | Partial Build | Improvement (Warm) |
|---|---|---|---|---|
| TYPO3 Core Changelog (3667 files) | 47.3s | 1.84s | 2.02s | -96% |
| TYPO3 Core API (957 files) | 34.9s | 1.55s | 5.43s | -96% |
| Rendertest (98 files) | 6.4s | 1.78s | 1.83s | -72% |
Note: Warm builds benefit from incremental caching - unchanged documents are skipped entirely. Partial builds (single file modified) show the overhead of dependency checking and minimal re-rendering. The 72-96% improvement comes from skipping unchanged documents in the render phase.
Parallel Processing Comparison (Rendertest - 98 files)
Comparison of different parallelism configurations on the small documentation set.
| Configuration | Cold Build | Warm Build | Memory | CPU Usage |
|---|---|---|---|---|
| main branch (official container) | 9.34s | 8.47s | 98 MB | 62% |
| feature (sequential, --parallel-workers=-1) | 6.83s (-27%) | 5.23s (-38%) | ~200 MB | 214% |
| feature (auto, --parallel-workers=0) | 6.62s (-29%) | 5.05s (-40%) | ~200 MB | 197% |
| feature (16 workers, --parallel-workers=16) | 6.77s (-28%) | 6.17s (-27%) | ~200 MB | 171% |
Key Observations
- 27-29% faster even without parallelism: The feature branch's optimizations (caching patches, PHP 8.5 improvements) provide significant speedup independent of parallel processing.
- Memory trade-off: Feature branch uses ~200 MB total (vs main's 98 MB) due to fork overhead, but gains significant speed.
- Warm caching works: Per the table above, feature branch warm builds are 9-24% faster than their own cold builds (23-24% in sequential and auto modes, 9% with 16 workers); the main branch improves by about 9% (8.47s warm vs 9.34s cold).
- Parallelism overhead on small docs: For small documentation sets, forcing 16 workers adds slight overhead. Auto-detection or sequential mode performs best.
- Parallelism shines on large docs: The parallel processing benefits are most visible on larger documentation sets (Core API: 76% faster, Core Changelog: 95% faster) where the fork overhead is amortized across many files.
Parallel Processing Comparison (TYPO3 Core API - 957 files)
Comparison of different parallelism configurations on the large documentation set.
| Configuration | Cold Build | Warm Build | Memory | CPU Usage |
|---|---|---|---|---|
| main branch (official container) | 185.5s | 162.6s | 358 MB | 91% |
| feature (sequential, --parallel-workers=-1) | 44.2s (-76%) | 35.5s (-78%) | ~580 MB | 391% |
| feature (auto, --parallel-workers=0) | 47.7s (-74%) | 46.2s (-72%) | ~580 MB | 421% |
| feature (16 workers, --parallel-workers=16) | 53.5s (-71%) | 44.9s (-72%) | ~580 MB | 436% |
Key insight: Sequential mode performs best on Core API docs, suggesting the parallel fork overhead outweighs benefits for ~1000 files. All modes still achieve 71-78% improvement over main branch.
Parallel Processing Comparison (TYPO3 Core Changelog - 3667 files)
Comparison of different parallelism configurations on the extra-large documentation set.
| Configuration | Cold Build | Warm Build | Memory | CPU Usage |
|---|---|---|---|---|
| main branch (official container) | 1179.6s (19.7min) | 917.6s (15.3min) | 2986 MB | 101% |
| feature (sequential, --parallel-workers=-1) | 56.3s (-95%) | 49.9s (-95%) | ~1.3 GB (-56%) | 440% |
| feature (auto, --parallel-workers=0) | 76.6s (-94%) | 72.3s (-92%) | ~1.3 GB (-56%) | 420% |
| feature (16 workers, --parallel-workers=16) | 71.1s (-94%) | 52.5s (-94%) | ~1.3 GB (-56%) | 437% |
Key insight: Sequential mode outperforms auto/16-workers on Changelog docs, achieving 21x speedup. The main branch takes 20 minutes cold / 15 minutes warm while feature branch completes in under 1 minute. Total memory usage drops from ~3 GB to ~1.3 GB.
Upstream Contributions
Performance patches submitted to phpDocumentor/guides monorepo.
Submitted Pull Requests (15 patches in 6 PRs)
| PR | Package | Patches | Status |
|---|---|---|---|
| #1287 | guides | slugger-anchor-normalizer, twig-template-renderer-globals, pre-node-renderer-factory-cache, twig-environment-cache, render-context-document-cache, document-name-resolver-cache, url-generator-cache, external-reference-resolver-cache (8 patches) | Open |
| #1288 | guides-restructured-text | inline-parser-lexer-reuse, line-checker-cache, buffer-unindent-cache, inline-lexer-regex-cache (4 patches) | Open |
| #1289 | guides-cli | guides-cli-container-cache (1 patch) | Open |
| #1290 | guides | Refactoring: Use file path comparison for DocumentEntryNode (no patch file) | Open |
| #1291 | guides-cli | guides-cli-symfony8-compat (1 patch) | Open |
| #1292 | guides | project-node-cache (1 patch) | Open |
Key Optimizations
- O(1) URI scheme lookup: Replaced 5600+ character regex with hash set lookup (~6x faster)
- O(1) document lookup: Direct hash map access for ProjectNode document entries
- Instance caching: Reuse parser instances instead of repeated instantiation
- Result caching: Cache expensive computations (URL resolution, slugger, Twig globals)
- DI container caching: Cache compiled Symfony container for faster startup
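The first optimization in the list, replacing a long alternation regex with a hash set lookup, can be illustrated with a small sketch. The scheme list below is a hypothetical, heavily shortened stand-in for the real one, and both function names are invented for this example; the actual patch is written in PHP and may differ in detail.

```python
import re

# Hypothetical short scheme list; the real list behind the 5600+ character
# regex is far longer.
KNOWN_SCHEMES = frozenset({"http", "https", "ftp", "mailto", "tel", "ssh"})

# Regex variant: one alternation over every known scheme.
SCHEME_REGEX = re.compile(
    r"^(?:" + "|".join(sorted(KNOWN_SCHEMES)) + r"):", re.IGNORECASE
)


def has_known_scheme_regex(uri: str) -> bool:
    """Match the URI prefix against the alternation regex."""
    return SCHEME_REGEX.match(uri) is not None


def has_known_scheme_set(uri: str) -> bool:
    """Split off the scheme once, then do a constant-time set lookup."""
    scheme, separator, _rest = uri.partition(":")
    return separator == ":" and scheme.lower() in KNOWN_SCHEMES


for candidate in ["https://example.org", "mailto:docs@example.org", "Index.rst", "foo:bar"]:
    print(candidate, has_known_scheme_regex(candidate), has_known_scheme_set(candidate))
```

The set variant does one string split and one hash lookup regardless of how many schemes are known, whereas the cost of the regex variant grows with the length of the alternation.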
Not Submitted Upstream
| Patches | Reason |
|---|---|
| directive-rule-regex-cache, enumerated-list-regex-cache, field-list-regex-cache, grid-table-rule-regex-cache, link-rule-regex-cache, simple-table-rule-regex-cache (6 patches) | Minimal value: Only moves regex patterns to class constants. PHP's PCRE engine already caches compiled patterns internally. |
| timing-run-handler, twig-renderer-profiling (2 patches) | Dev tooling: Profiling/benchmarking infrastructure for render-guides development, not suitable for upstream. |
Note: All upstream PRs target PHP 8.1+ and maintain backwards compatibility. The patches in this repository will be removed once the upstream PRs are merged and released.
Project Performance Changes
PHP 8.5 Upgrade & Modernization
- Require PHP 8.5 only, drop support for PHP 8.1-8.4: Enables access to latest PHP performance improvements and JIT optimizations
- Apply Rector PHP 8.5 code modernizations: First-class callable syntax, constructor property promotion, readonly properties
- #[\Override] attributes: Applied to all overriding methods for better static analysis
Parallel Processing (pcntl_fork)
- ForkingRenderer for parallel HTML rendering: Renders documents in parallel using forked processes, major speed improvement for large docs
- ParallelParseDirectoryHandler: Parses RST files in parallel for faster initial document loading
- ParallelCompileDocumentsHandler: Parallel compilation infrastructure (currently using sequential for toctree compatibility)
- DocumentNavigationProvider: Pre-computed prev/next navigation for parallel rendering
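Pre-computing prev/next navigation means each document's neighbours are known before any worker starts rendering. A minimal sketch of that idea follows; it is not the DocumentNavigationProvider implementation, and the function name `build_navigation` and the flat ordered list of documents are assumptions made for illustration.

```python
from typing import Optional


def build_navigation(
    ordered_documents: list[str],
) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Map each document to its (previous, next) neighbour in reading order."""
    navigation = {}
    last_index = len(ordered_documents) - 1
    for index, document in enumerate(ordered_documents):
        previous_doc = ordered_documents[index - 1] if index > 0 else None
        next_doc = ordered_documents[index + 1] if index < last_index else None
        navigation[document] = (previous_doc, next_doc)
    return navigation


navigation = build_navigation(["Index", "Installation/Index", "Usage/Index"])
print(navigation["Installation/Index"])
```

Building the map once in the parent process lets every worker answer "what comes before and after this page?" with a plain lookup instead of walking the document tree on its own.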
Caching Infrastructure
- Inventory caching (53% render time improvement): Cache interlink inventory lookups to avoid repeated HTTP requests
- Twig template caching: Enable filesystem caching for compiled Twig templates
- AST caching for parsed documents: Cache parsed document AST to skip re-parsing on warm builds
- OPcache CLI enabled in Docker: Enable PHP OPcache for CLI mode in Docker builds
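The inventory caching item boils down to memoizing a network lookup so that each inventory is fetched at most once per build. The sketch below shows the pattern under stated assumptions: `fetch_inventory` is a stand-in that only records calls instead of performing an HTTP request, the URL is a placeholder, and the real cache in this project may be persisted differently.

```python
from functools import lru_cache

# Records every simulated download so the effect of the cache is visible.
fetch_log: list[str] = []


def fetch_inventory(url: str) -> dict[str, str]:
    """Stand-in for an HTTP download of an interlink inventory."""
    fetch_log.append(url)
    return {"start": f"{url}/Index.html"}


@lru_cache(maxsize=None)
def cached_inventory(url: str) -> dict[str, str]:
    """Fetch an inventory once and reuse the result for later lookups."""
    return fetch_inventory(url)


for _ in range(3):
    cached_inventory("https://docs.example.invalid/core-api")
print(len(fetch_log))
```

Three lookups of the same inventory result in a single recorded fetch; documents with many interlinks to the same target benefit the most.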
Incremental Rendering Infrastructure
- IncrementalBuildCache: Track document content hashes for change detection
- ChangeDetector & DirtyPropagator: Detect changed files and propagate dirty state through dependency graph
- DependencyGraphPass & ExportsCollectorPass: Build dependency graph during compilation for incremental builds
- IncrementalTypeRenderer: Skip rendering unchanged documents using cache data
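Change detection by content hash can be sketched in a few lines. This is an illustration of the general technique rather than the IncrementalBuildCache code: the choice of SHA-256, the in-memory dictionaries, and the function names are all assumptions.

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash document content; SHA-256 is an assumption for this sketch."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_changes(
    previous_hashes: dict[str, str], current_sources: dict[str, str]
) -> set[str]:
    """Return documents that are new or whose content hash has changed."""
    return {
        path
        for path, text in current_sources.items()
        if previous_hashes.get(path) != content_hash(text)
    }


# Hashes stored by an earlier build.
cache = {
    "Index.rst": content_hash("Welcome"),
    "Usage.rst": content_hash("How to use"),
}
# Sources seen by the current build: one edited file and one new file.
sources = {
    "Index.rst": "Welcome",
    "Usage.rst": "How to use it",
    "New.rst": "Fresh page",
}
print(sorted(detect_changes(cache, sources)))
```

Comparing hashes rather than modification times keeps the result stable when files are re-checked-out or touched without being edited.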
Other Changes
- Performance benchmark infrastructure: Docker-based benchmark scripts for reproducible performance testing
- ProfilingEventListener: Pipeline timing measurement (enable with GUIDES_PROFILING=1)
- composer-normalize pre-commit: Added composer-normalize to pre-commit workflow
- PHPUnit 12 compatibility: Updated tests for PHPUnit 12 compatibility
- Test stability fixes: Clear AST cache before test runs to fix flaky tests
Compare Rendered Documentation
Review the rendered output from both branches to verify correctness:
- Rendertest (Feature Branch): 98 files rendered with feature/php-8.5-only
- Rendertest (Main Branch): 98 files rendered with main branch
Note: TYPO3 Core API rendered documentation (957 files) is not included due to size constraints. The performance benchmarks above demonstrate the improvement on this larger documentation set.
Incremental Build Performance (Feature Branch)
Incremental builds skip unchanged documents, only re-rendering files that have been modified or depend on changed exports. "Warm" = no changes (full cache hit), "Single File" = one document changed, "Cascade" = heavily-referenced document changed (triggers dependent re-renders).
| Documentation | Cold Build | Warm (0 files) | Single File (1 file) | Cascade (N dependents) |
|---|---|---|---|---|
| TYPO3 Core Changelog (3668 files) | 172.7s, ~1.3 GB | 0.47s (-99.7%) | 0.44s (-99.7%), 1 rendered | N/A |
| TYPO3 Core API (957 files) | 35.3s, ~580 MB | 0.39s (-99%) | 0.40s (-99%), 1 rendered | 0.37s (-99%), 158 rendered |
| Rendertest (94 files) | 6.9s, ~200 MB | 0.26s (-96%) | 0.25s (-96%), 1 rendered | N/A |
Key Insight: Constant Time Incremental Builds
Incremental build times stay within a narrow band across all project sizes: ~0.4-0.5s for 3668 files vs ~0.25s for 94 files. Incremental builds therefore scale with the number of changed files, not with the total project size.
Documents Re-rendered Per Scenario
- Warm (0 files): Nothing re-rendered - full cache hit, all documents skipped
- Single File (1 file): Only the modified document re-rendered (e.g., Index.rst has no dependents)
- Cascade (N files): Modified document + all dependents. E.g., EventDispatcher/Index.rst (156 refs) → 158 files re-rendered
Dependency Cascade
When a document's exports change (anchors, titles, citations), the system automatically re-renders dependent documents. This ensures cross-reference integrity across the documentation.
Example: Modifying EventDispatcher/Index.rst (referenced by 156 documents) triggers re-rendering of 158 files (1 modified + 156 dependents + Index) in 0.37s. This is still 99% faster than a cold build (35s) while ensuring all cross-references are updated correctly.
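The cascade can be modelled as a breadth-first walk over a reverse dependency graph: start from the changed documents and mark everything that references them. The sketch below is an illustration under assumptions, not the DirtyPropagator code. It treats every change as an export change, propagates transitively, and uses a toy graph with invented file names apart from EventDispatcher/Index.rst.

```python
from collections import deque


def propagate_dirty(changed: set[str], dependents: dict[str, set[str]]) -> set[str]:
    """Return the changed documents plus everything that depends on them."""
    dirty = set(changed)
    queue = deque(changed)
    while queue:
        document = queue.popleft()
        for dependent in dependents.get(document, ()):
            if dependent not in dirty:
                dirty.add(dependent)
                queue.append(dependent)
    return dirty


# Toy reverse dependency graph: key is referenced by each document in the value.
dependents = {
    "EventDispatcher/Index.rst": {"Hooks/Index.rst", "Events/Index.rst"},
    "Events/Index.rst": {"Index.rst"},
}
print(sorted(propagate_dirty({"EventDispatcher/Index.rst"}, dependents)))
```

Documents outside the returned set keep their cached output, which is why the cascade case stays far below cold build time even when well over a hundred files are re-rendered.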
Benchmark Scenarios
- Cold: All caches cleared, fresh render from scratch
- Warm: No file changes, full cache hit
- Single File: Modify one document with no/few dependents (e.g., Index.rst)
- Cascade: Modify a heavily-referenced document (e.g., EventDispatcher/Index.rst with 156 dependents)
Benchmark Methodology
| Parameter | Value |
|---|---|
| Test Environment | Docker container (Ubuntu-based) |
| PHP Version | 8.5.1 (feature branch) / 8.1.33 (main branch official container) |
| Runs per Scenario | 3 (avg/min/max recorded) |
| Cold Build | All caches cleared before each run |
| Warm Build | Initial run to populate caches, then 3 measured runs |
| Small Docs | Documentation-rendertest (98 files) |
| Large Docs | TYPO3CMS-Reference-CoreApi (957 files) |
| Extra-Large Docs | TYPO3 Core Changelog (3666 files) |
| CPU Cores | 16 (auto-detected, no container limits) |
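The "3 runs, avg/min/max recorded" procedure can be sketched as a small timing helper. The helper name `measure` and the callable-based interface are assumptions for illustration; the actual benchmark scripts are shell-based and time Docker runs.

```python
import statistics
import time


def measure(run, runs: int = 3) -> dict[str, float]:
    """Time a callable several times and summarise the wall-clock durations."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        run()
        durations.append(time.perf_counter() - start)
    return {
        "avg": statistics.mean(durations),
        "min": min(durations),
        "max": max(durations),
    }


# Placeholder workload standing in for one documentation build.
print(measure(lambda: sum(range(100_000))))
```

Recording min and max alongside the average makes outliers visible, which matters with only three runs per scenario.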
Parallel Processing: The feature branch uses pcntl_fork for parallel rendering, automatically detecting available CPU cores. Your results may vary depending on available cores. Use --parallel-workers=N to manually set the worker count.
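The benchmark tables use three kinds of values for --parallel-workers: -1 for sequential, 0 for auto-detection, and a positive number for an explicit worker count. A sketch of how such a value could be resolved is shown below. The function `resolve_worker_count` is hypothetical and the fallback to one worker when the core count is unknown is an assumption; the real option handling lives in the PHP code.

```python
import os
from typing import Optional


def resolve_worker_count(parallel_workers: int, cpu_count: Optional[int] = None) -> int:
    """Translate a --parallel-workers value into a concrete worker count."""
    if parallel_workers < 0:
        return 1  # sequential mode
    if parallel_workers == 0:
        # Auto mode: use the detected core count, or 1 if it is unknown.
        detected = cpu_count if cpu_count is not None else os.cpu_count()
        return detected or 1
    return parallel_workers  # explicit worker count


for option in (-1, 0, 16):
    print(option, resolve_worker_count(option, cpu_count=16))
```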
Environment Variability: Benchmark times are highly dependent on hardware. The times shown above were measured on a dedicated benchmark machine. On different hardware (e.g., WSL2, VMs, or slower CPUs), absolute times may be 2-3× higher while maintaining similar relative improvements between branches.
Reproduce These Benchmarks
Follow these steps to reproduce the benchmark results on your own machine:
Prerequisites
- Docker installed and running
- Git
- At least 16 GB RAM for large docs
    # Clone the repository
    git clone https://github.com/CybotTM/render-guides.git
    cd render-guides
    git checkout feature/php-8.5-only

    # Download test documentation
    ./benchmark/download-test-docs.sh TYPO3CMS-Reference-CoreApi  # Large (957 files)
    ./benchmark/download-test-docs.sh TYPO3-Core-Changelog        # Extra-large (3666 files)

    # === Feature Branch Benchmarks ===

    # Run cold benchmark on small docs (rendertest)
    ./benchmark/benchmark-docker.sh cold 3 small

    # Run cold benchmark on large docs (TYPO3 Core API)
    ./benchmark/benchmark-docker.sh cold 3 large

    # Run cold benchmark on extra-large docs (TYPO3 Core Changelog)
    ./benchmark/benchmark-docker.sh cold 3 changelog

    # Run warm benchmark (Twig caches populated)
    ./benchmark/benchmark-docker.sh warm 3 small
    ./benchmark/benchmark-docker.sh warm 3 large
    ./benchmark/benchmark-docker.sh warm 3 changelog

    # Run partial/incremental benchmark (modifies Index.rst)
    ./benchmark/benchmark-docker.sh partial 3 small
    ./benchmark/benchmark-docker.sh partial 3 large
    ./benchmark/benchmark-docker.sh partial 3 changelog

    # === Main Branch Benchmarks (using official container) ===

    # For main branch comparison, use the official TYPO3 render-guides container:
    docker pull ghcr.io/typo3-documentation/render-guides:latest

    # Small docs (rendertest)
    docker run --rm -v "$(pwd)/Documentation-rendertest:/project/Documentation" \
      -v /tmp/main-output:/project/Documentation-GENERATED-temp \
      ghcr.io/typo3-documentation/render-guides:latest --no-progress

    # Large docs (TYPO3 Core API)
    docker run --rm -v "$(pwd)/benchmark/test-docs/TYPO3CMS-Reference-CoreApi/Documentation:/project/Documentation" \
      -v /tmp/main-output-large:/project/Documentation-GENERATED-temp \
      ghcr.io/typo3-documentation/render-guides:latest --no-progress

    # Extra-large docs (TYPO3 Core Changelog) - warning: ~17 minutes on main branch
    docker run --rm -v "$(pwd)/benchmark/test-docs/TYPO3-Core-Changelog/typo3/sysext/core/Documentation:/project/Documentation" \
      -v /tmp/main-output-changelog:/project/Documentation-GENERATED-temp \
      ghcr.io/typo3-documentation/render-guides:latest --no-progress
Result files: Benchmark results are saved as JSON files in benchmark/results/ with the naming convention {branch}_{scenario}_{docs}_{timestamp}.json
References
- PR #1143 — Original performance optimization proposal and discussion
- Feature Branch — Source code for this implementation
- Upstream PRs — Performance patches submitted to phpDocumentor/guides