{ "cells": [ { "cell_type": "markdown", "id": "91f05010-b82d-42f3-b882-4f689eaa946c", "metadata": {}, "source": [ "(migration_guide)=\n", "# ArviZ migration guide\n", "\n", "We have been working on refactoring ArviZ to allow more flexibility and extensibility of its elements\n", "while keeping, as much as possible, a friendly user interface that gives sensible results with few or no arguments.\n", "\n", "One important change is enhanced modularity. Everything will still be available through a common namespace `arviz`,\n", "but ArviZ will now be composed of 3 smaller libraries:\n", "\n", "* [arviz-base](https://arviz-base.readthedocs.io/en/latest/) for data-related functionality, including converters from different PPLs.\n", "* [arviz-stats](https://arviz-stats.readthedocs.io/en/latest/) for statistical functions and diagnostics.\n", "* [arviz-plots](https://arviz-plots.readthedocs.io/en/latest/) for visual checks built on top of arviz-stats and arviz-base.\n", "\n", "Each library depends only on a minimal set of libraries, with a lot of functionality built on top of optional dependencies.\n", "This keeps ArviZ smaller and easier to install, as you can install only the components you really need. The main examples are:\n", "\n", "* `arviz-base` has no I/O library as a dependency, but you can use `netcdf4`, `h5netcdf` or `zarr` to read and write your data, allowing you to install only the one you need.\n", "* `arviz-plots` has no plotting library as a dependency, but it can generate plots with `matplotlib`, `bokeh` or `plotly` if they are installed.\n", "\n", "At the time of writing, the `arviz-xyz` libraries are independent of the `arviz` library, but `arviz` tries to import the `arviz-xyz` libraries\n", "and exposes all their elements through the `arviz.preview` namespace. In the future, with the ArviZ 1.0 release, the `arviz` namespace will look\n", "like `arviz.preview` does today.\n", "\n", "We encourage you to try it out and get a head start on the migration!" 
] }, { "cell_type": "code", "execution_count": 1, "id": "4074f836-233b-4b10-8483-d2177cad7424", "metadata": {}, "outputs": [], "source": [ "import arviz.preview as az\n", "# change to import arviz as az after ArviZ 1.0 release" ] }, { "cell_type": "markdown", "id": "a28ab6bd-d3c1-4f71-981f-3444f39ee249", "metadata": {}, "source": [ "Check that all 3 libraries have been exposed correctly:" ] }, { "cell_type": "code", "execution_count": 2, "id": "7c792e9f-9a22-4ba0-ad11-138fbc510784", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "arviz_base available, exposing its functions as part of arviz.preview\n", "arviz_stats available, exposing its functions as part of arviz.preview\n", "arviz_plots available, exposing its functions as part of arviz.preview\n", "\n" ] } ], "source": [ "print(az.info)" ] }, { "cell_type": "markdown", "id": "dba357f5-cc19-4fe3-918c-6a540723c3e9", "metadata": {}, "source": [ "## `arviz-base`" ] }, { "cell_type": "markdown", "id": "b73c5b99-234c-4b5d-ab3e-adfce2fb2edc", "metadata": {}, "source": [ "### `DataTree`\n", "One of the main differences is that the `arviz.InferenceData` object no longer exists.\n", "`arviz-base` uses `xarray.DataTree` instead. This is a new data structure in xarray,\n", "so it might still have some rough edges, but it is much more flexible and powerful.\n", "For example, I/O will now be more flexible: any format supported by\n", "xarray is automatically available to you, with no need for ArviZ to add wrappers on top of it.\n", "It is also possible to have arbitrary nesting of variables within groups and subgroups.\n", "\n", ":::{important}\n", "Not all the functionality of `xarray.DataTree` will be compatible with ArviZ, as it would be too much\n", "work for us to cover and maintain. 
If there are things you have always wanted to do that\n", "were not possible with `InferenceData` but are now possible with `DataTree`, please try\n", "them out and give feedback on them, and on the desired behaviour for things that still don't work.\n", "After a couple of releases the \"ArviZverse\" will be much more stable and it might no longer be\n", "possible to add support for them.\n", ":::\n", "\n", "#### I already have an `InferenceData` object from an external library\n", "`InferenceData` already has a method to convert it to a `DataTree`." ] }, { "cell_type": "code", "execution_count": 3, "id": "b7b8a1a1-3f90-497e-9321-2dbf9872541b", "metadata": {}, "outputs": [], "source": [ "import arviz as arviz_legacy" ] }, { "cell_type": "code", "execution_count": 4, "id": "e0d579b2-d178-46f1-bc87-5bb8d2db28b1", "metadata": {}, "outputs": [], "source": [ "idata = arviz_legacy.load_arviz_data(\"centered_eight\")" ] }, { "cell_type": "code", "execution_count": 5, "id": "7fd60b96-bb57-4859-b829-7a0fa7879b94", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.DatasetView> Size: 0B\n", "Dimensions: ()\n", "Data variables:\n", " *empty*
<xarray.DatasetView> Size: 0B\n", "Dimensions: ()\n", "Data variables:\n", " *empty*" ], "text/plain": [ "
<xarray.DatasetView> Size: 0B\n", "Dimensions: ()\n", "Data variables:\n", " *empty*" ], "text/plain": [ "
\n", "| | mu | theta[Choate] | theta[Deerfield] | theta[Phillips Andover] | theta[Phillips Exeter] | theta[Hotchkiss] | theta[Lawrenceville] | theta[St. Paul's] | theta[Mt. Hermon] | tau |\n", "
|---|---|---|---|---|---|---|---|---|---|---|\n", "
| (0, 0) | 7.871796 | 12.320686 | 9.905367 | 14.951615 | 11.011485 | 5.579602 | 16.901795 | 13.198059 | 15.061366 | 4.725740 |\n", "
| (0, 1) | 3.384554 | 11.285623 | 9.129324 | 3.139263 | 9.433211 | 7.811516 | 2.393088 | 10.055223 | 6.176724 | 3.908994 |\n", "
| (0, 2) | 9.100476 | 5.708506 | 5.757932 | 10.944585 | 5.895436 | 9.992984 | 8.143327 | 7.604753 | 8.767647 | 4.844025 |\n", "
| (0, 3) | 7.304293 | 10.037275 | 8.809068 | 9.900924 | 5.768832 | 9.062876 | 6.958424 | 10.298256 | 3.155304 | 1.856703 |\n", "
| (0, 4) | 9.879675 | 9.149146 | 5.764986 | 7.015397 | 15.688710 | 3.097395 | 12.025763 | 11.316745 | 17.046142 | 4.748409 |\n", "
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |\n", "
| (3, 495) | 1.542688 | 3.737751 | 5.393632 | 0.487845 | 4.015486 | 0.717057 | -2.675760 | 0.415968 | -4.991247 | 2.786072 |\n", "
| (3, 496) | 1.858580 | -0.291737 | 0.110315 | 1.468877 | -3.653346 | 1.844292 | 6.055714 | 4.986218 | 9.290380 | 4.281961 |\n", "
| (3, 497) | 1.766733 | 3.532515 | 2.008901 | 0.510806 | 0.832185 | 2.647687 | 4.707249 | 3.073314 | -2.623069 | 2.740607 |\n", "
| (3, 498) | 3.486112 | 4.182751 | 7.554251 | 4.456034 | 3.300833 | 1.563307 | 1.528958 | 1.096098 | 8.452282 | 2.932379 |\n", "
| (3, 499) | 3.404464 | 0.192956 | 6.498428 | -0.894424 | 6.849020 | 1.859747 | 7.936460 | 6.762455 | 1.295051 | 4.461246 |\n", "
2000 rows × 10 columns
\n", "<xarray.DatasetView> Size: 656B\n", "Dimensions: (school: 8)\n", "Coordinates:\n", " * school (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'\n", "Data variables:\n", " mu float64 8B 1.65e+03\n", " theta_t (school) float64 64B 2.058e+03 2.51e+03 ... 2.455e+03 2.757e+03\n", " tau float64 8B 1.115e+03\n", " theta (school) float64 64B 1.942e+03 2.199e+03 ... 2.079e+03 2.106e+03
<xarray.DatasetView> Size: 96B\n", "Dimensions: (chain: 4)\n", "Coordinates:\n", " * chain (chain) int64 32B 0 1 2 3\n", "Data variables:\n", " theta (chain) float64 32B 1.129e+03 408.2 329.2 580.9\n", " theta_t (chain) float64 32B 499.2 339.0 430.1 1.052e+03
<xarray.DatasetView> Size: 840B\n", "Dimensions: (ci_bound: 2, school: 8)\n", "Coordinates:\n", " * ci_bound (ci_bound) <U5 40B 'lower' 'upper'\n", " * school (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'\n", "Data variables:\n", " mu (ci_bound) float64 16B -2.028 10.27\n", " theta_t (school, ci_bound) float64 128B -1.565 2.493 ... -1.761 2.012\n", " tau (ci_bound) float64 16B 0.004998 9.09\n", " theta (school, ci_bound) float64 128B -2.824 17.41 ... -4.839 14.95
<xarray.DatasetView> Size: 328B\n", "Dimensions: (chain: 4, ci_bound: 2)\n", "Coordinates:\n", " * chain (chain) int64 32B 0 1 2 3\n", " * ci_bound (ci_bound) <U5 40B 'lower' 'upper'\n", "Data variables:\n", " mu (chain, ci_bound) float64 64B -1.987 10.23 ... -2.209 10.27\n", " theta_t (chain, ci_bound) float64 64B -1.754 1.931 -1.773 ... -1.606 2.041\n", " tau (chain, ci_bound) float64 64B 0.0111 9.189 ... 0.009194 9.042\n", " theta (chain, ci_bound) float64 64B -4.28 14.89 -4.415 ... -4.737 15.02" ], "text/plain": [ "
<xarray.DatasetView> Size: 656B\n", "Dimensions: (school: 8)\n", "Coordinates:\n", " * school (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'\n", "Data variables:\n", " mu float64 8B 1.65e+03\n", " theta_t (school) float64 64B 2.058e+03 2.51e+03 ... 2.455e+03 2.757e+03\n", " tau float64 8B 1.115e+03\n", " theta (school) float64 64B 1.942e+03 2.199e+03 ... 2.079e+03 2.106e+03" ], "text/plain": [ "