Core Concepts

This page covers the building blocks you'll use most often: how the client lazily models requests as DataCollection objects, how to convert them into DataFrames, and how to export them to files.

DataCollection

Every method on CoinMetricsClient returns a DataCollection. A DataCollection is a lazy wrapper around the underlying request: it captures the endpoint and parameters, but it does not hit the API until you iterate over it or call a transformation method (to_list(), to_dataframe(), export_to_csv(), ...).
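To make the lazy behaviour concrete, here is a toy sketch of the idea. It is not the library's implementation: `LazyCollection` and `fake_fetch` are hypothetical names, and the endpoint string is only illustrative. The point is that constructing the object records the request, while iterating (directly or through a method such as `to_list()`) is what triggers it.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List


class LazyCollection:
    """Toy stand-in for a lazy request wrapper (hypothetical, not the library class)."""

    def __init__(
        self,
        fetch: Callable[[str, Dict[str, Any]], Iterable[dict]],
        endpoint: str,
        params: Dict[str, Any],
    ) -> None:
        # Only record what to request; nothing is fetched here.
        self._fetch = fetch
        self.endpoint = endpoint
        self.params = params

    def __iter__(self) -> Iterator[dict]:
        # The request is issued when a consumer starts pulling rows.
        return iter(self._fetch(self.endpoint, self.params))

    def to_list(self) -> List[dict]:
        # Transformation methods trigger the request by iterating.
        return list(self)


calls: List[str] = []


def fake_fetch(endpoint: str, params: Dict[str, Any]) -> Iterable[dict]:
    # Hypothetical transport that records each call instead of using the network.
    calls.append(endpoint)
    return [{"asset": params["assets"], "BlkHgt": "1"}]


collection = LazyCollection(fake_fetch, "timeseries/asset-metrics", {"assets": "btc"})
calls_before = len(calls)    # still zero: building the collection fetched nothing
rows = collection.to_list()  # the request is issued here
calls_after = len(calls)
```

In this sketch each pass over the collection issues a new request, so a caller who needs the rows more than once would keep the materialized list.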

By default, time-series methods stream their response (format='json_stream'), so a single iteration over a DataCollection typically fetches everything in one continuous response. For example, to get market trades for the Coinbase BTC-USD pair:

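from coinmetrics.api_client import CoinMetricsClient

# Replace the placeholder with your own API key.
client = CoinMetricsClient("<your-api-key>")

# The request is sent when the loop starts iterating.
for trade in client.get_market_trades(
    markets='coinbase-btc-usd-spot',
    start_time='2020-01-01',
    end_time='2020-01-03',
    limit_per_market=10,
):
    print(trade)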

The same pattern works for daily metrics:

for metric_data in client.get_asset_metrics(
    assets='btc',
    metrics=['ReferenceRateUSD', 'BlkHgt', 'AdrActCnt', 'AdrActRecCnt', 'FlowOutBFXUSD'],
    frequency='1d',
    limit_per_asset=10,
):
    print(metric_data)

Exploring Available Data

Use the catalog_*_v2 methods to discover what data is available. For example, to list markets that report trades data:

You can also filter by exchange, base, or quote:
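A minimal sketch of both calls, assuming `catalog_market_trades_v2` is the catalog-v2 method for trades data and that it accepts `exchange`, `base`, and `quote` as keyword arguments. The helper name `list_trade_markets` is hypothetical, and `client` is an already-constructed `CoinMetricsClient`.

```python
from typing import Any, Dict, List, Optional


def list_trade_markets(
    client: Any,
    exchange: Optional[str] = None,
    base: Optional[str] = None,
    quote: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Return catalog rows for markets that report trades data."""
    # Only forward the filters the caller actually set.
    candidates = {"exchange": exchange, "base": base, "quote": quote}
    filters = {name: value for name, value in candidates.items() if value is not None}
    # Assumption: the catalog method accepts these filters as keyword arguments.
    return client.catalog_market_trades_v2(**filters).to_list()


# Usage with a real client (needs network access):
#   all_markets = list_trade_markets(client)
#   coinbase_btc_usd = list_trade_markets(
#       client, exchange="coinbase", base="btc", quote="usd"
#   )
```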

The catalog-v2 endpoints are designed to feed the historical-data endpoints. For example, to fetch one hour of all BTC market trades from Coinbase:
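One way this could look, as a sketch under a few assumptions: the catalog call is `catalog_market_trades_v2`, each catalog row exposes the market identifier under a `"market"` key, `base="btc"` is an adequate filter for "BTC markets", and the API accepts ISO 8601 timestamps ending in `Z`. The helper name `last_hour_coinbase_btc_trades` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


def last_hour_coinbase_btc_trades(client: Any, now: Optional[datetime] = None) -> Any:
    """Build a lazy request for one hour of Coinbase BTC trades ending at `now`."""
    # `now` should be timezone-aware; it is normalized to UTC before formatting.
    end = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    start = end - timedelta(hours=1)
    time_format = "%Y-%m-%dT%H:%M:%SZ"

    # Assumption: each catalog row carries the market id under the "market" key.
    catalog = client.catalog_market_trades_v2(exchange="coinbase", base="btc")
    markets = [row["market"] for row in catalog.to_list()]

    # Returns a DataCollection; nothing is fetched until it is consumed.
    return client.get_market_trades(
        markets=markets,
        start_time=start.strftime(time_format),
        end_time=end.strftime(time_format),
    )
```

The catalog lookup runs eagerly because the market list is needed to build the second request, while the trades themselves stay lazy until the returned collection is iterated or exported.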

DataFrames

Pandas

DataCollection.to_dataframe() materializes the response as a pandas DataFrame:
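A minimal sketch, assuming `catalog_market_trades_v2` can be called without arguments. The helper name `trade_markets_frame` is hypothetical, `client` is an already-constructed `CoinMetricsClient`, and pandas must be installed for the real call to succeed.

```python
from typing import Any


def trade_markets_frame(client: Any) -> Any:
    """Materialize the trades catalog as a pandas DataFrame."""
    collection = client.catalog_market_trades_v2()  # lazy: no request yet
    return collection.to_dataframe()                # the request happens here


# Usage with a real client (needs network access):
#   df = trade_markets_frame(client)
#   print(df.head())
```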

You can use the full pandas API to filter, transform, and persist the result:

Time-series data converts the same way:

Column types are derived from the endpoint's schema. To override types for specific columns, pass a dtype_mapper:
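A sketch of what such an override might look like. It assumes `dtype_mapper` is a dictionary from column name to a pandas dtype name; the exact accepted values should be checked against the client reference. The helper name `metrics_frame_with_types` is hypothetical, and the metric names are reused from the earlier example.

```python
from typing import Any, Dict


def metrics_frame_with_types(client: Any) -> Any:
    """Fetch daily BTC metrics and override the dtype of two columns."""
    # Assumption: keys are column names, values are pandas dtype names.
    dtype_mapper: Dict[str, str] = {
        "BlkHgt": "Int64",              # nullable integer
        "ReferenceRateUSD": "Float64",  # nullable float
    }
    collection = client.get_asset_metrics(
        assets="btc",
        metrics=["ReferenceRateUSD", "BlkHgt"],
        frequency="1d",
        limit_per_asset=10,
    )
    # Columns not listed in the mapper keep their schema-derived types.
    return collection.to_dataframe(dtype_mapper=dtype_mapper)
```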

Polars

Polars is a high-performance DataFrame library that is often faster than pandas. Pass dataframe_type="polars" to to_dataframe():
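A minimal sketch that reuses the market trades request from earlier on this page. The helper name `trades_polars_frame` is hypothetical, `client` is an already-constructed `CoinMetricsClient`, and the polars package is assumed to be installed.

```python
from typing import Any


def trades_polars_frame(client: Any) -> Any:
    """Fetch a small sample of trades and return it as a polars DataFrame."""
    collection = client.get_market_trades(
        markets="coinbase-btc-usd-spot",
        start_time="2020-01-01",
        end_time="2020-01-03",
        limit_per_market=10,
    )
    # Same method as for pandas; only the dataframe_type argument changes.
    return collection.to_dataframe(dataframe_type="polars")
```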

LazyFrames

DataFrames eagerly load data into memory. Lazy execution defers materialization, which is useful when you want to apply intermediate transformations to large datasets before collecting. Convert a DataCollection into a polars LazyFrame with to_lazyframe(). See the Best Practices guide for an in-depth example.

Data Exports

You can stream a DataCollection directly to a CSV, JSON, or Parquet file using export_to_csv(), export_to_json(), and export_to_parquet():
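A sketch of the three export calls, assuming each method accepts the output file path as its first positional argument. The helper name `export_trades` and the file names are hypothetical. A fresh request is built for each export so the example does not rely on a collection being reusable after it has been consumed.

```python
from typing import Any, List


def export_trades(client: Any, stem: str = "btc_trades") -> List[str]:
    """Write the same trades request to CSV, JSON, and Parquet files."""

    def request() -> Any:
        # Lazy: building the collection does not fetch anything.
        return client.get_market_trades(
            markets="coinbase-btc-usd-spot",
            start_time="2020-01-01",
            end_time="2020-01-03",
            limit_per_market=10,
        )

    paths = [f"{stem}.csv", f"{stem}.json", f"{stem}.parquet"]
    # Assumption: the first positional argument of each method is the file path.
    request().export_to_csv(paths[0])
    request().export_to_json(paths[1])
    request().export_to_parquet(paths[2])
    return paths
```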

For large historical exports across many assets or markets, see the Parallel Execution section in the Best Practices guide, which uses parallel().export_to_csv_files(), parallel().export_to_json_files(), and parallel().export_to_parquet_files() to write one file per worker.
