Listing Symbols in an ITCH 5.0 File
This notebook shows how to extract all available symbols from an ITCH 5.0 file. To process ITCH 4.1 files, use the itch41
module instead.
The example uses a sample ITCH 5.0 file that should be placed in the data/
subdirectory. You can download a sample file from Nasdaq.
In [6]:
from pathlib import Path
from meatpy.itch50 import ITCH50MessageReader
# Define the path to our sample data file
data_dir = Path("data")
file_path = data_dir / "S081321-v50.txt.gz"
print(f"✅ Found sample file: {file_path}")
print(f"File size: {file_path.stat().st_size / (1024**3):.2f} GB")
✅ Found sample file: data/S081321-v50.txt.gz
File size: 4.55 GB
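The cell above prints "Found sample file" without checking that the file is actually in data/. A minimal sketch of an explicit check follows; `describe_file` is a hypothetical helper introduced here for illustration, not part of meatpy.

```python
from pathlib import Path


def describe_file(path: Path) -> str:
    """Return a short description of the file, or raise if it is missing.

    Hypothetical helper for illustration; it uses only the standard library.
    """
    if not path.is_file():
        raise FileNotFoundError(f"Sample file not found: {path}")
    size_gb = path.stat().st_size / (1024**3)
    return f"{path}: {size_gb:.2f} GB"


# Report a missing file with a readable message instead of a traceback
try:
    print(describe_file(Path("data") / "S081321-v50.txt.gz"))
except FileNotFoundError as exc:
    print(exc)
```

Failing early here avoids a less obvious error later, when the reader tries to open the file.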
In [7]:
symbols = set()
message_count = 0
print("Reading ITCH file to extract symbols...")
with ITCH50MessageReader(file_path) as reader:
    for message in reader:
        message_count += 1
        # Stock Directory messages (type 'R') contain symbol information
        if message.type == b"R":
            symbol = message.stock.decode().strip()
            symbols.add(symbol)
        if message_count >= 100000:
            break
print(f"Found {len(symbols)} symbols after processing {message_count:,} messages")
symbols = sorted(symbols)
Reading ITCH file to extract symbols...
Found 11096 symbols after processing 100,000 messages
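The same loop can be wrapped in a reusable function. The sketch below assumes only what the cell above relies on: each message has a bytes `type` attribute, and Stock Directory messages have a bytes `stock` attribute. `collect_symbols` and `FakeMessage` are hypothetical names used for illustration. The stand-in messages are not meatpy objects, so the example runs without the sample file.

```python
from collections import namedtuple


def collect_symbols(messages, max_messages=100_000):
    """Collect symbols from Stock Directory ('R') messages.

    `messages` is any iterable of objects with a bytes `type` attribute and,
    for type b"R", a bytes `stock` attribute. Reading stops after
    `max_messages` messages. Returns (sorted symbols, messages read).
    """
    symbols = set()
    count = 0
    for message in messages:
        count += 1
        if message.type == b"R":
            symbols.add(message.stock.decode("ascii").strip())
        if count >= max_messages:
            break
    return sorted(symbols), count


# Stand-in messages for illustration only (not meatpy objects)
FakeMessage = namedtuple("FakeMessage", ["type", "stock"])
sample = [
    FakeMessage(b"S", None),
    FakeMessage(b"R", b"AAPL    "),
    FakeMessage(b"R", b"A       "),
    FakeMessage(b"A", b"AAPL    "),
]

symbols_found, n_read = collect_symbols(sample, max_messages=3)
print(symbols_found, n_read)
```

With the real file, the reader object from the cell above could be passed in place of `sample`, inside the same `with` block.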
In [8]:
print("First 20 symbols:")
for symbol in symbols[:20]:
    print(symbol)
First 20 symbols:
A
AA
AAA
AAAU
AAC
AAC+
AAC=
AACG
AACIU
AACOU
AADR
AAIC
AAIC-B
AAIC-C
AAIN
AAL
AAMC
AAME
AAN
AAOI
Key Points
- Stock Directory Messages: Stock Directory messages (type 'R') carry the symbol information and appear near the beginning of an ITCH file
- Early Termination: Since these messages appear near the beginning, we can stop reading after a fixed number of messages (e.g., 100,000) instead of scanning the whole file. Note: The specification does not guarantee that every Stock Directory message falls within that window, so treat the cutoff as a heuristic and raise it if symbols appear to be missing.
- Memory Efficiency: We keep only the set of symbol strings, not the messages themselves, and we stop before reading the entire file, so the approach stays cheap even for multi-gigabyte files
- Symbol Format: ITCH symbols are 8-byte fields, padded with spaces, which we strip for display
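The symbol format point can be sketched with plain bytes. The helpers below are hypothetical and independent of meatpy. They assume the field is ASCII and padded on the right, so they use `rstrip` where the notebook uses `strip`; the result is the same for right-padded fields.

```python
def decode_symbol(raw: bytes) -> str:
    """Decode a space-padded stock field into a display string."""
    return raw.decode("ascii").rstrip(" ")


def encode_symbol(symbol: str, width: int = 8) -> bytes:
    """Pad a symbol back to the fixed field width (8 bytes by default)."""
    encoded = symbol.encode("ascii")
    if len(encoded) > width:
        raise ValueError(f"{symbol!r} does not fit in {width} bytes")
    return encoded.ljust(width, b" ")


print(repr(decode_symbol(b"AAIC-B  ")))
print(encode_symbol("A"))
```

Padding a symbol back to the field width is useful when comparing against raw message fields rather than decoded strings.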
Next Steps
Once you have the list of symbols, you can:
- Filter the file to extract data for specific symbols of interest
- Process order book data for particular symbols
- Generate reports or visualizations for selected symbols
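As a starting point for the first item, the sketch below checks a watchlist against the extracted symbols before any further processing. `select_symbols` is a hypothetical helper, and the short `available` list stands in for the full `symbols` list built earlier.

```python
def select_symbols(available, wanted):
    """Split `wanted` into symbols present in `available` and symbols missing.

    Hypothetical helper for illustration; both results are sorted lists.
    """
    available_set = set(available)
    found = sorted(s for s in wanted if s in available_set)
    missing = sorted(s for s in wanted if s not in available_set)
    return found, missing


# A few symbols from the listing above stand in for the full `symbols` list
available = ["A", "AA", "AAL", "AAOI"]
found, missing = select_symbols(available, ["AAL", "MSFT", "A"])
print("Found:", found)
print("Missing:", missing)
```

A symbol reported as missing may simply lie beyond the early-termination cutoff, so a larger message limit is worth trying before concluding that it is absent from the file.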