Changelog
4.7.8
Hotfix: DataResource was still using BONSAI_HOME; it now uses DATAIO_ROOT
4.7.7
Hotfix: changed the BACITrade product field from int to str
4.7.6
Hotfix for an error in the group_and_sum method that caused a ValueError
4.7.5
Expose airflow config via static method in Config class
Performance update for conversion of units, currencies, and locations
Update external schemas for BACI and ADB
Add external schema for IO tables from ADB
4.7.3
Changed to_dataclass from iterrows to TypeAdapter.
Fixed a merge error in Config and added load_env again.
Fixed classification dicts in internal schemas
4.7.2
Reverted the product_code -> product rename in external monetary schema classifications.
4.7.1
Updated the StatCanChemProductionVolume external schema: added product correspondence
4.7.0
Deprecated CSVResourceRepository; please use ResourceRepository instead
Updated README
Introduced option for storing data directly in the database via the API
Updated the EuropeanMonetarySUT schema to point to the correct classification
Add external schema for African SUTs (ASUT)
4.6.4
Updated external schema classification naming; changed IndustrialCommodityStatistic location to short_name
4.6.3
Removed monetary_value and monetary_unit from trade and production volume schemas
Added an optional source field to both production volumes and trades
4.6.2
Added an optional accounttype column to the use, supply, production volume, and trade schemas
4.6.1
Added the UnfcccProductionVolume external schema
4.6.0
Added config from hybrid_sut to make integration smoother
4.5.15
Added diagonal to the external SUT schemas
Added the prodcom trade external schema
4.5.14
Undid changes to the schemas of Emissions_samples and Emissions_uncertainty
4.5.13
Added external schema UNdataWDI
Revised pairwise concordance loading
Added diagonal to the external SUT schemas
Added the prodcom trade external schema
Added a new schema "ContentData"
4.5.12
Added transfer_type to transferCoefficients
Handle import_location and export_location as locations
Added classification for locations to the trade external schemas
Added the ability for convert_dataframe to handle trade locations
4.5.11
Corrected the FAOSTAT data schema (classification)
4.5.10
Modified external schemas for PRODCOM location codes: GEOnumeric
4.5.9
Fixed tests for the new version of the classification package
Fixed units not being a parameter of load_with_classification
Handle one-to-many correspondences in generate_classification_mapping()
Fixed load_table_file() not working with samples columns
4.5.8
Modified external schema field: final_user
4.5.7
Added monetary fields to production volume and trade
Added api_endpoint and removed description from DataResource
4.5.6
Added external schemas for FAOstat data: FAOstat and FAOtrade
4.5.5
Added external schema comext
4.5.4
Added external schema StatCanChemProductionVolume
Added location to the USGS external schema
4.5.3
Added resource_repository.get_dataframe_for_resource(resource: DataResource)
version 4.5.2
Resource now inherits from emissions; added elementary type
version 4.5.1
Added a new category of schemas that include the sample vector. All old schemas have been renamed to 'oldname_uncertainty', and the new schemas are called 'oldname_samples'.
4.5.0
Added functionality that allows dataio to deal with concordance pair matrices
Added unit conversion. The following APIs are now available for converting units:
pass a "units" list with the set of target units to load_with_classification, e.g. ["tonne", "km", "EUR"]
directly use resource_repository.convert_units(data: pd.DataFrame, target_units: list[str])
use resource_repository.valid_units() to get a list of available units
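The unit-conversion behaviour described above can be sketched as follows. This is a minimal illustration only, not dataio's implementation: the conversion-factor table is invented for the example, and the real convert_units resolves factors internally.

```python
# Minimal sketch of convert_units-style behaviour: rescale rows whose unit
# has a known conversion factor to one of the requested target units.
import pandas as pd

# Hypothetical conversion table; dataio resolves these internally.
FACTORS = {("kg", "tonne"): 1e-3, ("m", "km"): 1e-3}

def convert_units(data: pd.DataFrame, target_units: list[str]) -> pd.DataFrame:
    out = data.copy()
    for i, row in out.iterrows():
        for target in target_units:
            factor = FACTORS.get((row["unit"], target))
            if factor is not None:
                # Rescale the value and record the new unit.
                out.at[i, "value"] = row["value"] * factor
                out.at[i, "unit"] = target
                break
    return out

df = pd.DataFrame({"value": [2500.0, 3000.0], "unit": ["kg", "m"]})
converted = convert_units(df, ["tonne", "km"])
```
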
4.4.1
Fixed get_empty_dataframe() not providing correct dtypes
4.4.0
Reworked classification conversion and introduced new APIs, convert_dataframe and convert_dataframe_to_bonsai_classification, which can be used to convert a dataframe directly without having to use the load method.
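A direct dataframe conversion of the kind introduced here can be sketched as below. The real convert_dataframe signature in dataio may differ; the concordance table and column names are invented for illustration.

```python
# Illustrative sketch of converting a dataframe's classification codes
# directly, without going through a load method.
import pandas as pd

# Hypothetical source-to-target code concordance.
CONCORDANCE = {"0101": "CPC_0111", "0102": "CPC_0112"}

def convert_dataframe(df: pd.DataFrame, column: str, concordance: dict) -> pd.DataFrame:
    out = df.copy()
    # Map known codes; leave unmapped codes unchanged.
    out[column] = out[column].map(concordance).fillna(out[column])
    return out

df = pd.DataFrame({"product": ["0101", "0102"], "value": [1.0, 2.0]})
converted = convert_dataframe(df, "product", CONCORDANCE)
```
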
version 4.3.3
New external schemas for SUTs, inheriting from ExternalMonetarySUT
New internal schemas for components of SUTs
4.3.2
Renamed the classification for the USGSProductionVolume schema
4.3.1
Fixed a bug with Windows paths not being interpreted correctly
4.3.0
load_with_classification and load_with_bonsai_classification now have a new way of interacting
version 4.2.0
load_with_classifications now allows lists in the dict values, which enables multiple classifications of the same column in a single call.
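The accepted argument shape can be illustrated with a small normalization helper. This is not dataio code; it only shows how a dict whose values may be single classifications or lists of them expands to (column, classification) pairs.

```python
# Expand a classifications dict, where each value may be a single
# classification name or a list of names, into (column, scheme) pairs.
def expand_classifications(classifications: dict) -> list[tuple[str, str]]:
    pairs = []
    for column, schemes in classifications.items():
        if not isinstance(schemes, list):
            schemes = [schemes]  # normalize a single name to a list
        for scheme in schemes:
            pairs.append((column, scheme))
    return pairs

# One column classified twice in a single call, plus a single-valued column.
pairs = expand_classifications({"location": ["ISO2", "ISO3"], "product": "CPC"})
```
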
version 4.1.3
Removed quantity from the trade schema
version 4.1.2
Fixed: self.available_resources is now an empty dataframe, not a dict, when freshly initialized
version 4.1.1
Updated classifications requirements in setup.cfg
version 4.1.0
Added load_with_classification and load_with_bonsai_classification methods to the CSV repository, along with tests
Added transform_dataframe method to BonsaiBaseModel
Added harmonize_with_resource to the CSV repository
Added many smaller helper methods for these
version 4.0.6
NA in input files is no longer detected as a NaN value. Full list of currently detected NaN values: ["", "NaN", "N/A", "n/a", "nan"]
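The behaviour above can be reproduced with pandas' own read_csv options, which is presumably close to what dataio does internally (an assumption; only the NA-value list comes from the changelog): disable the default NA set and supply only the listed values, so the literal string "NA" survives as data.

```python
# Read a CSV so that only the listed values count as NaN; "NA" is kept.
import io
import pandas as pd

DETECTED_NA = ["", "NaN", "N/A", "n/a", "nan"]

csv_text = "code,value\nNA,1\nn/a,2\n"
df = pd.read_csv(io.StringIO(csv_text), keep_default_na=False, na_values=DETECTED_NA)
# "NA" survives as a string; "n/a" becomes NaN.
```
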
version 4.0.5
Fixed deprecation warnings for pydantic
Added minimum required version for pydantic to avoid install-errors
version 4.0.4
Bugfix: ExternalDimensionTables inheritance
version 4.0.3
Bugfix: import * of schemas in __init__
version 4.0.2
Added PropertyOfProducts schema
Added associated_product to use and supply
Confirmed trade is the correct schema
Added product destination to supply
Added ExternalDimensionTables
version 4.0.1
Added BrokenYearMonetarySUT as an external schema for a specific type of monetary SUT: tables with annual data that start on a day other than January 1st, for example the fiscal year of India.
version 4.0.0
added B_Matrix schema to MatrixModel
updated PPF-fact-schema with units for activities and products
version 3.3.0
BREAKING CHANGE: current task name is now required by Config!
Added/updated arg_parser and set_logger from utilities
Fixed a bug where loading CSV did not interpret NaN values correctly
Fixed a bug where loading CSV did not parse bool values correctly
version 3.2.0
Added a feature to save matrices by specifying the datatype as .h5 in the location field of a resource
Also added MatrixModels to the schema: A_Matrix, Inverse, IntensitiesMatrix
version 3.1.7
Added external schema USGSProductionVolume
version 3.1.6
Fixed a bug where the last_update column in resources.csv was not saved and loaded in ISO format
version 3.1.5
Fixed a bug when only a single resource was loaded with get_resource_info
version 3.1.4
Added external schemas for: external monetary SUT, PRODCOM production volumes, UN and BACI data
version 3.1.3
Bugfix ensuring that the resource name and the file name do not need to be identical
version 3.1.2
Added schemas for classifications
Revised schema names to synchronize with the BONSAI ontology
version 3.1.1
Changed the behaviour of DataResource location. It is now possible to specify a root_location that is used to create absolute paths from relative locations.
With this change, locations in resources.csv for CSVResourceRepository are now all relative to the resources.csv file!
version 3.1.0
Feature change: load and all dependent methods now return a dict if more than one file is loaded; otherwise they return just a dataframe.
version 3.0.4
Bugfix: uncertainty columns from get_empty_dataframe are no longer doubled
version 3.0.3
Bugfix: location is now stored in relative format
Bugfix: updating a resource now works as expected
The resource_exists function now compares only non-metadata information (e.g. the comment field is ignored when checking)
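The metadata-ignoring comparison can be sketched as below. The set of metadata field names is an assumption for illustration; only the fact that fields like comment are excluded comes from the changelog.

```python
# Sketch: compare two resource records while ignoring metadata-only fields,
# as resource_exists is described to do. METADATA_FIELDS is hypothetical.
METADATA_FIELDS = {"comment", "created_by", "last_update"}

def resources_match(a: dict, b: dict) -> bool:
    def strip(record: dict) -> dict:
        return {k: v for k, v in record.items() if k not in METADATA_FIELDS}
    return strip(a) == strip(b)

# Same resource despite differing comments:
same = resources_match({"name": "x", "comment": "old"}, {"name": "x", "comment": "new"})
# Different resources:
different = resources_match({"name": "x"}, {"name": "y"})
```
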
version 3.0.2 (resource_exists)
Bug fixes
Added resource_repository.resource_exists(resource) -> bool
version 3.0.1 (flexible location)
Bug fixes
Added the possibility to use variables in the location name (e.g. clean/production/{version}/industry.csv).
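The placeholder in such a location string looks like Python's str.format syntax; assuming that is how it is resolved (an assumption about dataio's internals), substitution works like this:

```python
# Resolve a templated resource location; the {version} placeholder is
# filled in with a concrete value at load time.
location = "clean/production/{version}/industry.csv"
resolved = location.format(version="v1.2")
```
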
version 3.0.0 (rework of dataio)
This version potentially breaks old code that relies on the dataio package.
CSV Resource Repository: Introduced a new CSVResourceRepository class to manage data resources stored in CSV files, enabling adding, updating, and listing of resources.
Data Validation: Implemented data validation methods that ensure the integrity of data before it is saved, aligning with predefined schemas.
Environment Setup: Added functionality to set the BONSAI_HOME environment variable for specifying the project's home directory.
Resource Manipulation Methods: New methods for adding, updating, and retrieving data resources, including: add_to_resource_list, update_resource_list, get_resource_info
Data Handling Methods: Developed methods to read and write DataFrame objects directly related to specific tasks and resources, enhancing ease of data manipulation: write_dataframe_for_task, get_dataframe_for_task
Test Suite: A comprehensive test suite has been added to ensure the reliability and performance of the new functionalities.
Documentation: Provided detailed README.md documentation to assist users in understanding and utilizing the new features effectively.
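The repository workflow described in this release can be sketched with a toy in-memory class. The method names come from the changelog, but the signatures, the dict-shaped resource records, and the storage model are assumptions for illustration, not the actual dataio API.

```python
# Toy in-memory sketch of the resource-repository workflow: register a
# resource, write a dataframe for it, then read both back.
import pandas as pd

class ToyResourceRepository:
    def __init__(self):
        self._resources: dict = {}  # resource name -> record
        self._data: dict = {}       # resource name -> DataFrame

    def add_to_resource_list(self, resource: dict) -> None:
        self._resources[resource["name"]] = resource

    def get_resource_info(self, name: str) -> dict:
        return self._resources[name]

    def write_dataframe_for_task(self, name: str, df: pd.DataFrame) -> None:
        self._data[name] = df

    def get_dataframe_for_task(self, name: str) -> pd.DataFrame:
        return self._data[name]

repo = ToyResourceRepository()
repo.add_to_resource_list({"name": "industry", "location": "clean/industry.csv"})
repo.write_dataframe_for_task("industry", pd.DataFrame({"code": ["A"], "value": [1.0]}))
df = repo.get_dataframe_for_task("industry")
```
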
Version 2.0.3 (fixed bugs)
Read CSV files into a pandas dataframe properly (e.g. avoid converting NA into NaN)
Raise an error when frictionless.validate fails
Version 2.0.2 (minor changes)
Added future work
Changed tox.ini
Version 2.0.1 (included datapackage dependencies)
Edited validate to check foreign keys against datapackage fields in metadata
Implemented dialect handling (delimiter, quotation character, and skipinitialvalue)
Edited docs (syntax, future developments, tutorials)
Updated tutorials
Improved user experience (renamed validate report, provided plot output options)
Version 1.2.7 (small fix)
Fixed create/dump renaming bug
Layout of the create docstring is not good
Version 1.2.6 (fixed log refs)
Older terms like dump and visualize were replaced with save and plot
Version 1.2.5 (log path of saved metadata and tables)
Changed log strings
Version 1.2.4 (fixed datapackage save index issues)
'id' is the index; the column is now exported with its proper name
Version 1.2.3 (fixed datapackage create docstring)
Concerning tables
Version 1.2.1 (fixed bug in datapackage create function)
Indent was misplaced
Version 1.2.0 (revised tutorials and added metadata syntax)
Additionally revised terminology and simplified the API of several functions
Version 1.1.0 (added features to ‘dump’ and ‘describe’)
Added instructions on use of help() and dir() to ‘load’ toy model
Added auto-increment option to ‘dump’
Added option to override metadata fields to ‘describe’
Version 1.0.7 (improved user-friendliness)
Changed version dependency of pandas in setup.cfg install_requires from >= to None
Changed validate to raise exception only after running all pre-frictionless tests
Changed describe to accept absolute paths
Changed all functions to allow both Path and string paths
Removed output of .datapackage.yaml from describe
Version 1.0.6 (made dependencies flexible)
Changed version dependencies in setup.cfg install_requires from == to >=
Version 1.0.5 (fixed web documentation)
Renamed subfolder with datapackage
Reformatted docstrings
Version 1.0.4 (generate latex documentation)
Edited several configuration files
Version 1.0.3 (added jupyter notebooks)
Added jupyter notebooks to load and dump tutorials
Added print statements to load and dump tutorials
Renamed distro to match package name to fix website version number issue
Version 1.0.2 (fixed instructions)
Fixed instructions for installing package
BUG: API is not rendering in website
BUG: website is showing unknown version
Version 1.0.0 (datapackage is working)
Functions describe, validate, visualize, load, dump working for datapackage
Tests using export illustrations working
Version 2.1
Added pydantic data classes for the PPF section of the workflow.
Added utility methods for data classes that allow transformation from JSON to dataclasses, from dataclass to pandas and back, along with other utilities.
Added tests for all the utility functions.
Added first draft of Uncertainty data class.