February 17, 22:19

What are the techniques and best practices for Python performance optimization?

Performance Analysis Tools

1. timeit Module

The timeit module is used to measure the execution time of small code snippets.

python
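import timeit

# Measure the execution time of a code string
code = """
sum(range(1000))
"""
execution_time = timeit.timeit(code, number=1000)
print(f"Execution time: {execution_time:.4f} seconds")

# timeit.timeit is not a decorator; pass the callable to it instead
def test_function():
    return sum(range(1000))

function_time = timeit.timeit(test_function, number=1000)
print(f"Function time: {function_time:.4f} seconds")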
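
A single timing can be skewed by whatever else the machine is doing. The sketch below takes several measurements with `timeit.repeat` and keeps the minimum, which is a common convention; the statement and the repeat counts are arbitrary values chosen for illustration.

```python
import timeit

# Take 5 measurements, each running the statement 1000 times
timings = timeit.repeat("sum(range(1000))", repeat=5, number=1000)

# The minimum is usually the measurement least disturbed by background noise
best = min(timings)
print(f"Best of {len(timings)}: {best / 1000 * 1e6:.2f} microseconds per loop")
```

Averaging the runs would fold background noise into the result, which is why the minimum is generally preferred when comparing two implementations.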

2. cProfile Module

The cProfile module profiles a program function by function, which helps locate performance bottlenecks.

python
import cProfile

def slow_function():
    total = 0
    for i in range(1000000):
        total += i
    return total

def fast_function():
    return sum(range(1000000))

def main():
    slow_function()
    fast_function()

# Profile main() and print the statistics to stdout
cProfile.run('main()')

# Save the statistics to a file for later analysis
cProfile.run('main()', filename='profile_stats')
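
Raw profiler output can be long. As a sketch, the standard `pstats` module can sort the collected statistics and print only the most expensive entries; the function names and the limit of 5 entries are illustrative choices.

```python
import cProfile
import io
import pstats

def slow_function():
    total = 0
    for i in range(100000):
        total += i
    return total

def main():
    return slow_function()

# Collect statistics only around the call of interest
profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Sort by cumulative time and print only the 5 most expensive entries
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```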

3. memory_profiler

memory_profiler is used to analyze memory usage.

python
# Install: pip install memory-profiler
from memory_profiler import profile

@profile
def memory_intensive_function():
    data = [i for i in range(1000000)]
    return sum(data)

if __name__ == '__main__':
    memory_intensive_function()
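
If installing a third-party package is not an option, the standard library's `tracemalloc` module gives a coarser view of memory use. This is a different tool from memory_profiler, shown here only as a minimal sketch; the function body mirrors the example above with a smaller range.

```python
import tracemalloc

def memory_intensive_function():
    data = [i for i in range(100000)]
    return sum(data)

# Trace Python memory allocations made while the function runs
tracemalloc.start()
result = memory_intensive_function()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"Current: {current / 1024:.1f} KiB, peak: {peak / 1024:.1f} KiB")
```

The peak value is the more useful number here, because the temporary list has already been freed by the time the function returns.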

4. line_profiler

line_profiler is used to analyze function performance line by line.

python
# Install: pip install line_profiler
from line_profiler import LineProfiler

def complex_function():
    result = []
    for i in range(1000):
        result.append(i * 2)
    return sum(result)

# Create the profiler and wrap the function to be measured
lp = LineProfiler()
lp_wrapper = lp(complex_function)
lp_wrapper()

# Display per-line results
lp.print_stats()

Algorithm Optimization

1. Choose Appropriate Algorithms

python
# Bad practice - O(n²) comparisons, plus a linear scan of the result list
def find_duplicates_slow(arr):
    duplicates = []
    for i in range(len(arr)):
        for j in range(i + 1, len(arr)):
            if arr[i] == arr[j] and arr[i] not in duplicates:
                duplicates.append(arr[i])
    return duplicates

# Good practice - O(n) on average (items must be hashable; result order is not preserved)
def find_duplicates_fast(arr):
    seen = set()
    duplicates = set()
    for item in arr:
        if item in seen:
            duplicates.add(item)
        else:
            seen.add(item)
    return list(duplicates)
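
The same single-pass idea can be written with `collections.Counter`. The helper name `find_duplicates_counter` is made up for this sketch, and it assumes the items are hashable.

```python
from collections import Counter

def find_duplicates_counter(arr):
    # Counter tallies every item in one pass; keep those seen more than once
    counts = Counter(arr)
    return [item for item, count in counts.items() if count > 1]

print(find_duplicates_counter([1, 2, 3, 2, 4, 1, 1]))  # prints [1, 2]
```

Because `Counter` keeps first-seen order, this variant returns duplicates in the order they first appear in the input.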

2. Use Built-in Functions

python
import timeit

# Bad practice - Manual implementation
def manual_sum(arr):
    total = 0
    for item in arr:
        total += item
    return total

# Good practice - Use built-in functions
def builtin_sum(arr):
    return sum(arr)

# Performance comparison
print(timeit.timeit(lambda: manual_sum(range(10000)), number=100))
print(timeit.timeit(lambda: builtin_sum(range(10000)), number=100))

3. Avoid Unnecessary Computations

python
# Bad practice - Repeated computation: every pair is computed twice
# (i, j and j, i), and each point is also compared with itself
def calculate_distances(points):
    distances = []
    for i in range(len(points)):
        for j in range(len(points)):
            dx = points[j][0] - points[i][0]
            dy = points[j][1] - points[i][1]
            distances.append((dx ** 2 + dy ** 2) ** 0.5)
    return distances

# Good practice - Distance is symmetric, so compute each unordered pair once
def calculate_distances_optimized(points):
    distances = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[j][0] - points[i][0]
            dy = points[j][1] - points[i][1]
            distances.append((dx ** 2 + dy ** 2) ** 0.5)
    return distances

Data Structure Optimization

1. Use Appropriate Data Structures

python
import timeit

# List lookup - O(n)
def find_in_list(lst, target):
    return target in lst

# Set lookup - O(1) on average
def find_in_set(s, target):
    return target in s

# Performance comparison
lst = list(range(10000))
s = set(range(10000))

print("List lookup:", timeit.timeit(lambda: find_in_list(lst, 5000), number=1000))
print("Set lookup:", timeit.timeit(lambda: find_in_set(s, 5000), number=1000))

2. Use Generators Instead of Lists

python
import sys

# Bad practice - Build the whole list in memory
def get_squares_list(n):
    return [i ** 2 for i in range(n)]

# Good practice - Yield values one at a time
def get_squares_generator(n):
    for i in range(n):
        yield i ** 2

# Memory usage comparison
# Note: sys.getsizeof counts only the list container, not the integers it holds,
# so the real gap is even larger than the numbers printed here
list_obj = get_squares_list(1000000)
gen_obj = get_squares_generator(1000000)

print(f"List memory: {sys.getsizeof(list_obj)} bytes")
print(f"Generator memory: {sys.getsizeof(gen_obj)} bytes")
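
When the values are consumed only once, for example by `sum`, a generator expression gives the same result without ever holding the full sequence. The sketch below uses an arbitrary size `n` for illustration.

```python
import sys

n = 100000

# The list comprehension materializes every square before summing
squares_list = [i ** 2 for i in range(n)]
total_from_list = sum(squares_list)

# The generator expression produces one square at a time
squares_gen = (i ** 2 for i in range(n))
gen_size = sys.getsizeof(squares_gen)
total_from_gen = sum(squares_gen)

print(f"List container: {sys.getsizeof(squares_list)} bytes")
print(f"Generator object: {gen_size} bytes")
print(total_from_list == total_from_gen)  # prints True
```

The trade-off is that a generator can be iterated only once; keep the list when the data has to be indexed or traversed repeatedly.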

3. Use __slots__ to Reduce Memory

python
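import sys

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

class PersonWithSlots:
    __slots__ = ['name', 'age']

    def __init__(self, name, age):
        self.name = name
        self.age = age

# Memory comparison
p1 = Person("Alice", 25)
p2 = PersonWithSlots("Alice", 25)

# sys.getsizeof does not follow references, so add the per-instance __dict__
# of the regular object; the slotted object has no __dict__ at all
regular_size = sys.getsizeof(p1) + sys.getsizeof(p1.__dict__)
print(f"Regular object: {regular_size} bytes")
print(f"With __slots__: {sys.getsizeof(p2)} bytes")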

I/O Optimization

1. Batch Process I/O

python
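# Bad practice - One write call per line
def write_lines_slow(filename, lines):
    with open(filename, 'w') as f:
        for line in lines:
            f.write(line + '\n')

# Good practice - Build the text once and write it in a single call
def write_lines_fast(filename, lines):
    with open(filename, 'w') as f:
        # Every line gets a trailing newline, so the output matches write_lines_slow
        f.write(''.join(line + '\n' for line in lines))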

2. Use Buffering

python
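# Bad practice - No buffering
# (buffering=0 is only allowed in binary mode; in text mode it raises ValueError)
def read_without_buffer(filename):
    with open(filename, 'rb', buffering=0) as f:
        return f.read()

# Good practice - Use buffering (binary mode here too, so both functions return bytes)
def read_with_buffer(filename):
    with open(filename, 'rb', buffering=8192) as f:
        return f.read()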
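
For files too large to read in one call, a related pattern is to read fixed-size chunks. The helper `read_in_chunks` and the 8192-byte chunk size are assumptions made for this sketch, and the temporary file exists only to make the example runnable.

```python
import os
import tempfile

def read_in_chunks(filename, chunk_size=8192):
    # Yield the file piece by piece so only one chunk is in memory at a time
    with open(filename, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

# Demonstration on a throwaway 20000-byte file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 20000)
    path = tmp.name

chunk_sizes = [len(chunk) for chunk in read_in_chunks(path)]
os.remove(path)
print(chunk_sizes)  # prints [8192, 8192, 3616]
```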

3. Asynchronous I/O

python
import asyncio
import aiohttp

async def fetch_url(session, url):
    async with session.get(url) as response:
        return await response.text()

async def fetch_all_urls(urls):
    # Share one ClientSession so connections are pooled across requests
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_url(session, url) for url in urls]
        return await asyncio.gather(*tasks)

urls = [
    "https://www.example.com",
    "https://www.google.com",
    "https://www.github.com",
]

# Fetch all URLs concurrently
results = asyncio.run(fetch_all_urls(urls))

Concurrency Optimization

1. Multiprocessing for CPU-Intensive Tasks

python
import multiprocessing

def process_data(data_chunk):
    return sum(x ** 2 for x in data_chunk)

def parallel_processing(data, num_processes=4):
    # Ceiling division keeps chunk_size >= 1 and yields at most num_processes chunks
    chunk_size = max(1, -(-len(data) // num_processes))
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    with multiprocessing.Pool(processes=num_processes) as pool:
        results = pool.map(process_data, chunks)

    return sum(results)

# The guard is required on platforms that start workers with "spawn" (Windows, macOS)
if __name__ == '__main__':
    data = list(range(1000000))
    result = parallel_processing(data)

2. Multithreading for I/O-Intensive Tasks

python
import threading
import requests

def download_url(url):
    # A timeout keeps a stalled server from blocking the thread forever
    response = requests.get(url, timeout=10)
    return len(response.content)

def parallel_download(urls):
    threads = []
    results = []

    def worker(url):
        # list.append is thread-safe in CPython; results arrive in completion order
        result = download_url(url)
        results.append(result)

    for url in urls:
        thread = threading.Thread(target=worker, args=(url,))
        threads.append(thread)
        thread.start()

    for thread in threads:
        thread.join()

    return results

# Placeholder URLs; replace them with real ones before running
urls = ["url1", "url2", "url3"]
results = parallel_download(urls)

3. Use concurrent.futures

python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def process_item(item):
    return item ** 2

def with_thread_pool(items):
    with ThreadPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(process_item, items))
    return results

def with_process_pool(items):
    with ProcessPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(process_item, items))
    return results

# The guard is required for ProcessPoolExecutor on platforms that use "spawn"
if __name__ == '__main__':
    items = list(range(1000))
    thread_results = with_thread_pool(items)
    process_results = with_process_pool(items)
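
`executor.map` returns results in input order. When results should be handled as soon as each task finishes, `submit` combined with `as_completed` is an alternative. The function `process_as_completed` is a name invented for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def process_item(item):
    return item ** 2

def process_as_completed(items):
    results = {}
    with ThreadPoolExecutor(max_workers=4) as executor:
        # Map each future back to the input that produced it
        future_to_item = {executor.submit(process_item, item): item for item in items}
        for future in as_completed(future_to_item):
            item = future_to_item[future]
            # result() re-raises any exception raised inside the worker
            results[item] = future.result()
    return results

squares = process_as_completed(range(10))
print(squares[3])  # prints 9
```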

Caching Optimization

1. Use functools.lru_cache

python
from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

# Fast calculation: each value is computed only once
print(fibonacci(100))
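
To check whether a cache is actually being hit, functions wrapped with `lru_cache` expose `cache_info()` and `cache_clear()`. A short sketch, using a smaller argument than the example above:

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

value = fibonacci(30)

# hits = calls answered from the cache, misses = calls that ran the function body
info = fibonacci.cache_info()
print(value, info)

# Reset the cache, for example between benchmark runs
fibonacci.cache_clear()
```

A low hit count relative to misses suggests the cache is too small or the arguments rarely repeat, in which case caching adds overhead without much benefit.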

2. Custom Caching

python
class Cache:
    def __init__(self, max_size=128):
        self.cache = {}
        self.max_size = max_size

    def get(self, key):
        return self.cache.get(key)

    def set(self, key, value):
        # Evict the oldest inserted entry (FIFO) only when a new key arrives at a full cache
        if key not in self.cache and len(self.cache) >= self.max_size:
            self.cache.pop(next(iter(self.cache)))
        self.cache[key] = value

cache = Cache()

def expensive_computation(x):
    cached_result = cache.get(x)
    if cached_result is not None:
        return cached_result

    result = sum(i ** 2 for i in range(x))
    cache.set(x, result)
    return result
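
The cache above evicts entries in insertion order. If least-recently-used eviction is wanted instead, one possible sketch builds on `collections.OrderedDict`; the class name `LRUCache` and its methods are illustrative, not part of any library.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, max_size=128):
        self.cache = OrderedDict()
        self.max_size = max_size

    def get(self, key, default=None):
        if key not in self.cache:
            return default
        # Mark the entry as most recently used
        self.cache.move_to_end(key)
        return self.cache[key]

    def set(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        if len(self.cache) > self.max_size:
            # Drop the least recently used entry (the first one)
            self.cache.popitem(last=False)

cache = LRUCache(max_size=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")      # "a" becomes the most recently used entry
cache.set("c", 3)   # evicts "b"
print(list(cache.cache))  # prints ['a', 'c']
```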

3. Use Redis Cache

python
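import hashlib
import pickle
import redis

# Connect to Redis
r = redis.Redis(host='localhost', port=6379, db=0)

def cache_result(key, value, ttl=3600):
    """Cache a result for ttl seconds."""
    r.setex(key, ttl, pickle.dumps(value))

def get_cached_result(key):
    """Return the cached result, or None on a cache miss."""
    result = r.get(key)
    # Only unpickle data from a Redis instance you trust
    if result is not None:
        return pickle.loads(result)
    return None

def complex_computation(data):
    # Placeholder for the real expensive work
    return sum(x ** 2 for x in data)

def expensive_operation(data):
    # hash() of a string changes between processes, so build the key from a stable digest
    digest = hashlib.sha256(str(data).encode()).hexdigest()
    cache_key = f"result:{digest}"

    # Try the cache first; compare with None so falsy results such as 0 still count as hits
    cached = get_cached_result(cache_key)
    if cached is not None:
        return cached

    # Execute the computation and cache the result
    result = complex_computation(data)
    cache_result(cache_key, result)
    return result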

String Optimization

1. Use join Instead of +

python
import timeit

# Bad practice - Use +
def build_string_slow(parts):
    result = ""
    for part in parts:
        result += part
    return result

# Good practice - Use join
def build_string_fast(parts):
    return ''.join(parts)

# Performance comparison
parts = ["part"] * 1000
print(timeit.timeit(lambda: build_string_slow(parts), number=100))
print(timeit.timeit(lambda: build_string_fast(parts), number=100))

2. Use String Formatting

python
import timeit

# Bad practice - String concatenation
def format_message_slow(name, age):
    return "Name: " + name + ", Age: " + str(age)

# Good practice - Use f-string
def format_message_fast(name, age):
    return f"Name: {name}, Age: {age}"

# Performance comparison
print(timeit.timeit(lambda: format_message_slow("Alice", 25), number=10000))
print(timeit.timeit(lambda: format_message_fast("Alice", 25), number=10000))

3. Use String Methods

python
import timeit

# Bad practice - Manual processing
def process_string_slow(s):
    result = ""
    for char in s:
        if char.isupper():
            result += char.lower()
        else:
            result += char
    return result

# Good practice - Use built-in methods
def process_string_fast(s):
    return s.lower()

# Performance comparison
print(timeit.timeit(lambda: process_string_slow("HELLO"), number=10000))
print(timeit.timeit(lambda: process_string_fast("HELLO"), number=10000))

Database Optimization

1. Use Connection Pool

python
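from sqlalchemy import create_engine, text
from sqlalchemy.pool import QueuePool

# Create an engine backed by a connection pool
engine = create_engine(
    'postgresql://user:password@localhost/dbname',
    poolclass=QueuePool,
    pool_size=10,
    max_overflow=5
)

def execute_query(query):
    # Each call borrows a pooled connection and returns it when the block exits
    with engine.connect() as connection:
        # text() wraps a raw SQL string; SQLAlchemy 2.0 no longer accepts plain strings
        result = connection.execute(text(query))
        return result.fetchall()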

2. Batch Insert

python
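# db is assumed to be an open DB-API cursor (for example from psycopg2),
# and items(value) is a hypothetical table used for illustration

# Bad practice - Insert one by one
def insert_slow(items):
    for item in items:
        db.execute("INSERT INTO items (value) VALUES (%s)", (item,))

# Good practice - Batch insert
def insert_fast(items):
    db.executemany("INSERT INTO items (value) VALUES (%s)", [(item,) for item in items])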
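
For a version that runs without a database server, the same batch pattern can be sketched with the standard library's `sqlite3` module and an in-memory database. The `items` table and its `value` column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (value INTEGER)")

rows = [(i,) for i in range(1000)]

# sqlite3 uses "?" placeholders; drivers such as psycopg2 use "%s"
with conn:  # commits once on success, rolls back on error
    conn.executemany("INSERT INTO items (value) VALUES (?)", rows)

count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(count)  # prints 1000
conn.close()
```

Wrapping the batch in a single transaction matters as much as `executemany` itself, since committing after every row is often the dominant cost.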

3. Use Indexes

sql
-- Create index
CREATE INDEX idx_name ON users(name);

-- This query can use the index
SELECT * FROM users WHERE name = 'Alice';

-- Avoid full table scans
-- Bad practice: wrapping the column in a function prevents use of the plain index on name
SELECT * FROM users WHERE LOWER(name) = 'alice';

-- Good practice: compare the indexed column directly
SELECT * FROM users WHERE name = 'Alice';
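
Whether a query uses an index can be checked rather than assumed. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` through the standard `sqlite3` module; the table layout is an assumption for the example, the helper `query_plan` is hypothetical, and other databases have their own `EXPLAIN` syntax and output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE INDEX idx_name ON users(name)")

def query_plan(sql):
    # The last column of each plan row is a human-readable description of one step
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

plan_direct = query_plan("SELECT * FROM users WHERE name = 'Alice'")
plan_lower = query_plan("SELECT * FROM users WHERE LOWER(name) = 'alice'")

print("Direct comparison:", plan_direct)
print("Wrapped in LOWER():", plan_lower)
conn.close()
```

The first plan should mention `idx_name`, while the second falls back to scanning the table. The exact wording of the plan text varies between SQLite versions.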

Best Practices

1. Pre-allocate Memory

python
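# Bad practice - Grow the list with repeated append calls
def build_list_slow():
    result = []
    for i in range(10000):
        result.append(i)
    return result

# Good practice - A list comprehension avoids the per-item append call
def build_list_fast():
    return [i for i in range(10000)]

# When the final size is known, the list can also be allocated up front
def build_list_preallocated():
    result = [None] * 10000
    for i in range(10000):
        result[i] = i
    return result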

2. Avoid Global Variables

python
# Bad practice - Use global variables
counter = 0

def increment_global():
    global counter
    counter += 1

# Good practice - Use local variables
def increment_local(counter):
    return counter + 1

3. Use Appropriate Data Types

python
# Bad practice - Use lists for numeric data
numbers = [1, 2, 3, 4, 5]

# Good practice - Use arrays (typed, compact storage)
import array
numbers = array.array('i', [1, 2, 3, 4, 5])

# Bad practice - Use strings for binary data
data = "binary data"

# Good practice - Use bytes
data = b"binary data"

4. Lazy Loading

python
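# Placeholder stand-ins so the example runs; substitute the real data source and processing step
large_dataset = range(1000000)

def process_item(item):
    return item * 2

# Bad practice - Load all data immediately
def load_all_data():
    data = []
    for item in large_dataset:
        processed = process_item(item)
        data.append(processed)
    return data

# Good practice - Lazy loading
def load_data_lazy():
    for item in large_dataset:
        yield process_item(item)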
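
The benefit of lazy loading shows up when only part of the data is needed. In this sketch, `itertools.islice` takes the first three results and the remaining items are never processed. The `processed` list exists only to record which items were touched, and `load_data_lazy` takes the dataset as a parameter here to keep the example self-contained.

```python
from itertools import islice

processed = []

def process_item(item):
    processed.append(item)  # record the call for demonstration purposes
    return item * 2

def load_data_lazy(dataset):
    for item in dataset:
        yield process_item(item)

# Only the first three items are ever pulled through the generator
first_three = list(islice(load_data_lazy(range(1000000)), 3))
print(first_three, len(processed))  # prints [0, 2, 4] 3
```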

Performance Monitoring

1. Use logging to Record Performance

python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)

def logged_function(func):
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        # perf_counter is a monotonic clock intended for measuring intervals
        start_time = time.perf_counter()
        result = func(*args, **kwargs)
        end_time = time.perf_counter()
        logging.info(f"{func.__name__} execution time: {end_time - start_time:.4f} seconds")
        return result
    return wrapper

@logged_function
def expensive_function():
    time.sleep(1)
    return "Done"

expensive_function()

2. Use Performance Counters

python
import functools
import time
from collections import defaultdict

class PerformanceMonitor:
    def __init__(self):
        self.counters = defaultdict(list)

    def record(self, name, duration):
        self.counters[name].append(duration)

    def get_stats(self, name):
        # .get avoids creating an empty entry for an unknown name
        durations = self.counters.get(name, [])
        if not durations:
            # Nothing recorded yet: avoid dividing by zero or calling min() on an empty list
            return {'count': 0, 'total': 0.0, 'average': 0.0, 'min': None, 'max': None}
        total = sum(durations)
        return {
            'count': len(durations),
            'total': total,
            'average': total / len(durations),
            'min': min(durations),
            'max': max(durations)
        }

monitor = PerformanceMonitor()

def monitored_function(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start_time = time.perf_counter()
        result = func(*args, **kwargs)
        end_time = time.perf_counter()
        monitor.record(func.__name__, end_time - start_time)
        return result
    return wrapper
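
Decorators time whole functions. To time an arbitrary block of code, a context manager is one option. The helper `timed` and the plain dictionary used to collect durations are assumptions made for this sketch.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, sink):
    # perf_counter is a monotonic, high-resolution clock suited to intervals
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the duration even if the block raises
        sink.setdefault(label, []).append(time.perf_counter() - start)

durations = {}
with timed("sum_squares", durations):
    total = sum(i ** 2 for i in range(100000))

print(f"sum_squares took {durations['sum_squares'][0]:.4f} seconds")
```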

Summary

Key points of Python performance optimization:

  1. Performance Analysis Tools: timeit, cProfile, memory_profiler, line_profiler
  2. Algorithm Optimization: Choose appropriate algorithms, use built-in functions, avoid unnecessary computations
  3. Data Structure Optimization: Use appropriate data structures, use generators, use __slots__
  4. I/O Optimization: Batch processing, use buffering, asynchronous I/O
  5. Concurrency Optimization: Multiprocessing, multithreading, concurrent.futures
  6. Caching Optimization: lru_cache, custom caching, Redis caching
  7. String Optimization: Use join, string formatting, string methods
  8. Database Optimization: Connection pooling, batch insertion, use indexes
  9. Best Practices: Pre-allocate memory, avoid global variables, use appropriate data types, lazy loading
  10. Performance Monitoring: logging, performance counters

Performance optimization principles:

  • Measure first, then optimize
  • Optimize bottlenecks, not all code
  • Balance readability and performance
  • Use built-in functions and libraries
  • Consider using C extensions or Cython

Mastering these techniques makes it possible to write faster, more efficient Python programs.

Tags: Python