Performance Optimization Summary

🎯 Achievement

7-8x Performance Improvement achieved through systematic optimizations!

  • Sequential: 5.12s
  • Optimized: 0.67s
  • Speedup: 7.68x
  • Time saved: 87%

📊 Final Benchmark Results

Comprehensive Comparison (15 requests)

| Method           | Time  | Req/s | Apps/s | Speedup  |
|------------------|-------|-------|--------|----------|
| Asyncio+Executor | 0.67s | 22.4  | 2,248  | 7.68x 🥇 |
| Batch-ALL        | 0.69s | 21.7  | 2,175  | 7.43x 🥈 |
| Batch-5chunk     | 1.24s | 12.1  | 1,214  | 4.15x    |
| ThreadPool-10    | 1.24s | 12.1  | 1,205  | 4.12x    |
| ThreadPool-5     | 1.32s | 11.4  | 1,136  | 3.88x    |
| Sequential       | 5.12s | 2.9   | 293    | 1.00x    |

Smaller Scale (9 requests)

| Method       | Time  | Apps | Speedup  |
|--------------|-------|------|----------|
| Batch-ALL    | 0.68s | 900  | 4.98x 🚀 |
| Batch-3chunk | 1.08s | 900  | 3.14x    |
| Sequential   | 3.39s | 900  | 1.00x    |

Larger Scale (25 requests)

| Method       | Time  | Req/s | Speedup  |
|--------------|-------|-------|----------|
| Batch-ALL    | 1.25s | 20.05 | 7.97x 🚀 |
| Batch-5chunk | 2.82s | 8.85  | 3.52x    |
| Sequential   | 9.94s | 2.51  | 1.00x    |

🔧 Implemented Optimizations

1. Batch Processing Functions ⚡⚡⚡ (CRITICAL)

Problem: Every block_on call pays the cost of entering and exiting the Tokio runtime, so sequential code pays that cost once per request.

# ❌ Before: 15 block_on calls
for req in requests:
    result = fetch_and_parse_list(...)  # Each call: runtime enter/exit

Solution: Single block_on with parallel futures

// ✅ After: 1 block_on call
runtime.block_on(async {
    let futures = requests.iter().map(|req| fetch(...));
    try_join_all(futures).await  // True parallel!
})

Impact: 7-8x faster!
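The same effect can be reproduced in pure Python, independent of playfast: starting one event loop per request mirrors repeated block_on calls, while a single loop awaiting all requests at once mirrors the batched path. A minimal sketch (fake_fetch is a hypothetical stand-in for a real HTTP request):

```python
import asyncio
import time


async def fake_fetch(i: int) -> int:
    await asyncio.sleep(0.05)  # stand-in for one HTTP round trip
    return i


def sequential(n: int) -> list[int]:
    # One event loop per request - analogous to one block_on per request
    return [asyncio.run(fake_fetch(i)) for i in range(n)]


def batched(n: int) -> list[int]:
    # One event loop for all requests - analogous to a single block_on
    async def run_all() -> list[int]:
        return await asyncio.gather(*(fake_fetch(i) for i in range(n)))

    return asyncio.run(run_all())


if __name__ == "__main__":
    t0 = time.perf_counter()
    sequential(15)
    t_seq = time.perf_counter() - t0

    t0 = time.perf_counter()
    batched(15)
    t_batch = time.perf_counter() - t0
    print(f"sequential: {t_seq:.2f}s, batched: {t_batch:.2f}s")
```

Because the sleeps overlap inside one loop, the batched version finishes in roughly one request's latency regardless of n.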

2. Global HTTP Client ⚡⚡ (HIGH)

Before:

let client = PlayStoreClient::new(timeout)?;  // New client each time

After:

static HTTP_CLIENT: Lazy<PlayStoreClient> = Lazy::new(|| {
    PlayStoreClient::new(30).expect("Failed to create HTTP client")
});

Benefits:

  • TCP connection reuse
  • Connection pooling
  • ~10% performance improvement
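The same build-once-share-everywhere pattern can be sketched in Python with functools.cache. `Client` below is a hypothetical stand-in for PlayStoreClient, not part of the playfast API; the point is that the expensive object is constructed on first use and every later call returns the same instance:

```python
from functools import cache


class Client:
    """Stand-in for an HTTP client that is expensive to construct."""

    def __init__(self, timeout: int) -> None:
        self.timeout = timeout


@cache
def get_client() -> Client:
    # Built once on the first call; every later call returns the same
    # object, so (in a real client) TCP connections can be pooled and reused.
    return Client(timeout=30)
```

`get_client() is get_client()` holds, which is exactly what the Rust `Lazy` static guarantees.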

3. CPU-Aware Tokio Runtime ⚡⚡ (HIGH)

Before:

.worker_threads(4)  // Hardcoded

After:

let num_cpus = std::thread::available_parallelism()
    .map(|n| n.get())
    .unwrap_or(4);
let worker_threads = (num_cpus / 2).clamp(2, 8);

Benefits:

  • Adapts to system resources
  • Leaves CPU for Python threads
  • ~5% performance improvement
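The clamping logic is easy to check from Python. `pick_worker_threads` below is a hypothetical mirror of the Rust expression, not part of the playfast API:

```python
import os


def pick_worker_threads() -> int:
    # Mirror of the Rust logic: half the CPUs, clamped to [2, 8]
    num_cpus = os.cpu_count() or 4  # fall back to 4 if undetectable
    return min(max(num_cpus // 2, 2), 8)
```

On the 16-core benchmark machine this yields 8 workers, leaving the other half of the cores free for Python threads.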

4. Memory Optimization ⚡ (Nice-to-have)

String Interning:

  • Without: ~3,870 bytes (15 requests)
  • With: ~630 bytes (15 requests)
  • Savings: 83.7%
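The savings come from sys.intern: equal strings built at runtime normally occupy separate allocations, but interned copies all reference one canonical object. A quick illustration, independent of playfast (`intern_all` is a hypothetical helper):

```python
import sys


def intern_all(values: list[str]) -> list[str]:
    # Replace each string with its canonical interned copy so that
    # duplicates share a single allocation
    return [sys.intern(v) for v in values]


# 15 requests repeating the same category value, built at runtime
# so the compiler cannot constant-fold them into one object
raw = ["_".join(["GAME", "ACTION"]) for _ in range(15)]
assert len({id(s) for s in raw}) == 15  # 15 separate allocations

shared = intern_all(raw)
assert all(s is shared[0] for s in shared)  # 15 refs to one object
```

With many countries × categories, most request fields are duplicates, which is why interning recovers most of the string memory.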

BatchRequestBuilder:

from playfast import BatchRequestBuilder

builder = BatchRequestBuilder(
    collection="topselling_free", lang="en", intern_strings=True
)

requests = list(builder.build_list_requests(countries=countries, categories=categories))

📈 Performance Analysis

Key Insights

1. block_on Count is Critical

Sequential (15x block_on):     5.12s
Batch-5chunk (3x block_on):    1.24s (4.1x faster)
Batch-ALL (1x block_on):       0.69s (7.4x faster)

Conclusion: Minimize block_on calls!

2. Batch vs Asyncio: Nearly Identical

Asyncio+Executor: 0.67s
Batch-ALL:        0.69s
Difference:       0.02s (3%)

Why?

  • Both use Python 3.14t free-threading
  • Both achieve true parallelism
  • Asyncio has slightly less overhead

3. Chunking Trade-off

Batch-ALL (1 call):        0.69s
Batch-5chunk (3 calls):    1.24s
Batch-10chunk (2 calls):   ~0.9s (estimated)

Use chunking when:

  • Memory is constrained
  • You need progress updates between chunks
  • You want per-chunk error handling
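Chunking itself needs no library support; a generic helper like the hypothetical `chunked` below splits a request list into fixed-size batches for repeated batch calls:

```python
from collections.abc import Iterable, Iterator
from itertools import islice


def chunked(items: Iterable, size: int) -> Iterator[list]:
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


# 15 requests in chunks of 5 -> 3 batch calls instead of 15 single calls
batches = list(chunked(range(15), 5))
assert [len(b) for b in batches] == [5, 5, 5]
```

Each chunk can then be passed to one batch call, with progress reported and errors handled between calls.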

4. Memory vs Performance

| Optimization       | Performance Impact | Memory Savings  |
|--------------------|--------------------|-----------------|
| Batch functions    | +700% 🚀           | -               |
| block_on reduction | +700% 🚀           | -               |
| Global HTTP client | +10%               | Connection pool |
| String interning   | < 5%               | 83.7% 💾        |
| Generators         | 0%                 | Lazy allocation |

Priority: Performance > Memory (but we got both!)


💡 Usage Guide

When to Use What

Small Batches (< 10 requests) - Simple API

# NEW: High-level API - simplest and most intuitive!
from playfast import fetch_category_lists

results = fetch_category_lists(
    countries=["us", "kr", "jp"], categories=["GAME_ACTION", "SOCIAL"], num_results=100
)

Medium Batches (10-50 requests) - Organized Results

# NEW: Get organized results by country and category
from playfast import fetch_top_apps

organized = fetch_top_apps(
    countries=["us", "kr", "jp"], categories=["GAME_ACTION", "SOCIAL"], num_results=100
)

# Easy access
us_games = organized["us"]["GAME_ACTION"]
kr_social = organized["kr"]["SOCIAL"]

Large Batches (50+ requests) - Builder Pattern

# NEW: Use BatchFetcher for defaults and reuse
from playfast import BatchFetcher

fetcher = BatchFetcher(
    lang="en", default_num_results=100, default_collection="topselling_free"
)

# Multiple fetches with shared defaults
batch1 = fetcher.category_lists(
    countries=["us", "kr", "jp"], categories=["GAME_ACTION", "SOCIAL"]
)

batch2 = fetcher.category_lists(
    countries=["de", "gb", "fr"], categories=["PRODUCTIVITY", "ENTERTAINMENT"]
)

Complex Workflows - AsyncIO Integration

# Asyncio for integration with other async code
import asyncio
from playfast import fetch_category_lists


async def complex_workflow():
    # Mix with other async operations
    other_data = await fetch_something_else()

    # Use asyncio.to_thread for batch functions
    rust_results = await asyncio.to_thread(
        fetch_category_lists,
        countries=["us", "kr"],
        categories=["GAME_ACTION"],
        num_results=100,
    )

    return combine(other_data, rust_results)

Advanced: Low-Level API (if needed)

# For advanced users who need fine control
from playfast._core import fetch_and_parse_list_batch
from playfast import BatchRequestBuilder

builder = BatchRequestBuilder()
requests = list(builder.build_list_requests(countries, categories))
results = fetch_and_parse_list_batch(requests)

🎯 Best Practices

DO ✅

  1. Use high-level batch functions (NEW!)

# ✅ Best: Simple and intuitive
from playfast import fetch_category_lists

results = fetch_category_lists(
    countries=["us", "kr"], categories=["GAME_ACTION"], num_results=100
)

  2. Use organized results for easy access

# ✅ Good: Easy to navigate
from playfast import fetch_top_apps

organized = fetch_top_apps(countries, categories)
us_games = organized["us"]["GAME_ACTION"]

  3. Use BatchFetcher for multiple batches

# ✅ Good: Reuse defaults
from playfast import BatchFetcher

fetcher = BatchFetcher(lang="en", default_num_results=100)
batch1 = fetcher.category_lists(countries1, categories)
batch2 = fetcher.category_lists(countries2, categories)

  4. Use asyncio.to_thread for complex workflows

# ✅ Good: Integrates with other async code
results = await asyncio.to_thread(fetch_category_lists, countries, categories)

DON'T โŒ

  1. Don't access _core directly (unless needed)

# ❌ Bad: Low-level API (harder to use)
from playfast._core import fetch_and_parse_list_batch

requests = build_requests_manually()
results = fetch_and_parse_list_batch(requests)

# ✅ Good: High-level API (easier)
from playfast import fetch_category_lists

results = fetch_category_lists(countries, categories)

  2. Don't call single functions in a loop

# ❌ Bad: Multiple block_on calls
from playfast._core import fetch_and_parse_list

for country in countries:
    result = fetch_and_parse_list(...)  # Slow!

# ✅ Good: Single batch call
from playfast import fetch_category_lists

results = fetch_category_lists(countries, categories)

  3. Don't use executors for simple cases

# ❌ Bad: Unnecessary complexity
with ThreadPoolExecutor() as executor:
    futures = [executor.submit(...) for ...]

# ✅ Good: Just use batch functions
results = fetch_category_lists(countries, categories)


🎉 Summary

What We Achieved

| Metric           | Before     | After        | Improvement   |
|------------------|------------|--------------|---------------|
| Time (15 req)    | 5.12s      | 0.67s        | 87% faster    |
| Throughput       | 293 apps/s | 2,248 apps/s | 7.68x         |
| block_on calls   | 15         | 1            | 93% reduction |
| Memory (10K req) | ~5 MB      | ~0.9 MB      | 83% reduction |
| Code complexity  | Same       | Same         | No regression |

Key Takeaways

  1. New high-level API is easier 📝
     • No _core access needed
     • No manual request building
     • Organized results available
     • Example: fetch_category_lists(countries, categories)

  2. Batch processing = 7-8x speedup 🚀
     • Most critical optimization
     • Simple to use with new API
     • Works out of the box

  3. block_on count matters most ⚡
     • 15 calls → 1 call = 7.4x faster
     • Even chunking (3 calls) = 4.1x faster

  4. Memory optimizations are free 💾
     • Python auto-interns literals
     • BatchRequestBuilder adds < 5% overhead
     • 83.7% memory savings

  5. API Levels for different needs 🎯
     • High-level: fetch_category_lists() - Easy
     • Mid-level: BatchRequestBuilder - Flexible
     • Low-level: _core.* - Advanced control

Future Improvements

  • Automatic batch size optimization
  • Per-request error handling (return Result for each)
  • Request prioritization within batches
  • Streaming batch results
  • Python 3.14t further optimizations

Last Updated: 2025-10-13 Benchmark Environment: Windows, 16-core CPU, Python 3.14t