Releases: capitalone/DataProfiler

0.10.6

13 Nov 21:24
302a458

Profiler

  • Staging/main/0.10.6 #1065
  • Update: Version 0.10.6 #1064
  • Feature: Plugins #1060
  • Hot Fix: Contribution Doc #1059
  • Rename references to degree of freedom from df to deg_of_free #1056
  • add_s3_connection_remote_loading_s3uri_feature #1054
  • feat: add null ratio to column stats #1052
  • Delay transforming priority_order into ndarray #1045
  • Fix Codeowners List #1043

Documentation

  • Update: Documentation 0.10.6 #1066
  • Docs: AWS S3 Data Reading #1063
  • Update docs to reflect renamed output of deg_of_free #1057

Full Changelog: 0.10.5...0.10.6

0.10.5

25 Sep 15:37
9bca4e7

Profiler

Documentation

  • Update docs 0.10.5 #1042
  • Update docs 0.10.5 #1041

Full Changelog: 0.10.4...0.10.5

0.10.4

22 Sep 12:01
02b7070

Profiler

  • version bump (#1032) #1036
  • Staging/main/0.10.4 #1029
  • added psi calculation to categorical columns #1027
  • Bump actions/checkout from 3 to 4 #1024
  • Minor: Profiler Path Fix in Example Notebook #1021
  • modified the assignees for issue creation #1016
  • Make sure random_state is a list before indexed assignment #968

Documentation

  • Update docs 0.10.4 #1038
  • Update docs 0.10.4 #1037
  • update install instructions for mac #1026

Full Changelog: 0.10.3...0.10.4

0.10.3

07 Aug 19:10
b0b8510

Profiler

  • Staging: main 0.10.3 #1004
  • Fix ProfilerOptions() documentation #1002

Feature: Multiprocess

  • Staging: into dev feature/multiprocess #998
  • Multiprocess automation feature into staging/dev. #997
  • Syncing feature/multiprocess into staging/dev/multiprocess #992
  • Automate multiprocess option #984

Feature: num_quantiles option

  • Staging: into dev feature/num-quantiles #990
  • Fix Scipy Mend Issue #988
  • HistogramAndQuantilesOption sync with dev branch #987

Documentation

  • Update docs to 0.10.3 #1012
  • Update docs to 0.10.3 #1011
  • fixed snappy install issue on Mac #1010
  • Staging: into dev-gh-pages the docs for multiprocess. #1001
  • Add docs to multiprocess option in StructuredOptions. #999
  • Staging: into dev-gh-pages the docs for num_quantiles. #993
  • Add docs for num_quantiles option for histogram_and_quantiles. #991

Full Changelog: 0.10.2...0.10.3

0.10.2

28 Jul 16:18
ec47d45

Profiler

  • hotfix[0.10.2]: cat vs float bug #973

Documentation

  • Staging: Update docs to 0.10.2 #978
  • Update docs to 0.10.2 #979

Full Changelog: 0.10.1...0.10.2

0.10.1

17 Jul 18:21
6cb789a

Profiler

  • Hot Fix: .astype("bool") #960

Documentation

  • Staging: Update docs 0.10.1 #961
  • Update docs 0.10.1 #962

Full Changelog: 0.10.0...0.10.1

0.10.0

30 Jun 15:04
77ddb29

Profiler

  • Forking workflow directions CONTRIBUTING.md #857
  • Fixing diagram rendering in CONTRIBUTING.md #862
  • Fix initial value of processor_type #863
  • fix: test bug due to bad mocks #878
  • added differences section to unstructured data example #877
  • Reservoir sampling refactor #910
  • feat: add dev to workflow for testing #897
  • Cms for categorical #892
  • Hotfix: fix post feature serialization merge #942
  • Update version to 0.10.0 #944
  • Staging/main/0.10.0 #943

Profiler: Profile Serialization

  • Staging/dev/profile serialization #940
  • fix: order bug #939
  • fix: null_rep mat should calculate even if datetime #933
  • Profiler: load_method hotfix #932
  • Top level hotfix: save / load .lower() #931
  • Notebook Example save/load Profile #930
  • refactor: use seed for sample for consistency #927
  • Profile Builder load() serialization #925
  • Reuse passed labeler #924
  • BaseProfiler save() for json #923
  • Added testing for values for test_json_decode_after_update #915
  • UnstructuredProfiler: Added NoImplementationError #907
  • fix: bug and add tests for structuredcolprofiler #904
  • Structured profiler encode decode #903
  • refactor: allow options to go through all #902
  • StructuredColProfiler Encode / Decode #901
  • Decode options #894
  • Quick Test update #893
  • Deserialization of datalabeler #891
  • ColumnDataLabelerCompiler: serialize / deserialize #888
  • Add Serialization and Deserialization Tests for Stats Compiler, plus refactors for order Typing #887
  • Adds deserialization for compilers and validates tests for Primitive; fixes numerical deserialization #886
  • Adds tests validating serialization with Primitive type for compiler #885
  • feat: add test and compiler serialization #884
  • ready datalabeler for deserialization and improvement on serializatio… #879
  • Encode Options #875
  • Encode/Decode TextColumnProfiler #870
  • Created encoder for the datalabelercolumn #869
  • Added test to ensure order attribute for ordered column profiler functions correctly after deserialization #868
  • Added decoding for encoding of ordered column profiles #864
  • Json decode date time column #861
  • Float column profiler encode decode #854
  • hot fixes for encode and decode of numeric stats mixin and intcol pro… #852

Profiler: Options

  • staging/dev/options #909
  • RowStatisticsOptions: Implementing option #871
  • New preset implementation and test #867
  • RowStatisticsOptions: Add option #865

Documentation

  • Staging update docs 0.10.0 #945
  • Documentation: Fix Req #922
  • Documentation: Update for Reservoir Sampling #919
  • documentation update for cms specific options to category #917
  • Add forking / branch workflow image #858

Documentation: Profile Serialization

  • Merge staging/dev-gh-pages/profile-serialization into dev-gh-pages #937
  • Docs: Profiler Serialization Clean Up #936
  • Docs: Profiler Serialization #928

Documentation: Options

  • Documentation: feature/options branch docs updates #921
  • Row statistics option documentation #883
  • updating docs for preset name #882
  • Add documentation for median_abs_deviation option #881
  • Preset test updated w new names and different toggles #880
  • reset ignore, update .gitignore, update documentation on presets #874
  • Fixed documentation for sampling_ratio option #873

Full Changelog: 0.9.0...0.10.0

0.9.0

01 Jun 16:05
4d157c8

Profiler

  • Encode int column #780
  • Decode categorical #786
  • Encode update format #789
  • Optimization for text column profile ksneab #791
  • Remove unnecessary cast() in csv_data.py (1) #796
  • Remove unnecessary cast() in csv_data.py (2) #798
  • Update main with change in memory-optimization #799
  • Remove unnecessary cast() in data.py #800
  • Remove unnecessary cast() in graph_data.py #801
  • Fix CategoricalColumn test #804
  • Specify init calls in data readers reload() methods #805
  • Fix dask dataframe import #812
  • Fix CharsetMatches type error #813
  • Json Decoder Code Cleanup #814
  • Fix override errors #819
  • Sampling ratio option #825
  • Memory Optimization to main #832
    • Fixed testing to run on all feature branches for PRs #793
    • Part 1 fix for categorical mem opt issue #795
    • cleanup time space analysis code #797
    • quick update to feature/memory-optimization for merge to main #802
    • Update feat mem #803
    • Categorical Stop Condition Options #808
    • Space time analysis improvement #809
    • implementation of setting stop conds via options for cat column profiler #810
    • Fix for histogram merging #815
    • Fixes categorical bug when stop condition is met #816
    • hotfix for more conservative stop condition in categorical columns #817
    • Coverage Fix Memory Optimization Feature Branch #823
    • Added option to remove calculations for updating row statistics #827
    • Fix to doc strings #829
    • Preset Option Fix: presets docstring added #830
  • Fix LSP violations #840
  • Fix argument types in doc comments #843

Documentation

  • Fix minor typo #788
  • Github pages memory optimization #833
    • added new options to docs #828
    • Preset Option Fix: Added presets documentation to profiler options section #831
  • Update docs for 0.9.0 #851

Other Changes

  • Memory testing and data gen scripts #781
  • Update for new Dask version in Validator test #784
  • Space analysis dataset sampling addition #787
  • fix bug in dataset generation #790
  • Update pre-commit mypy dependencies #811
  • Coverage Fix to Main Branch #822
  • Update version to 0.9.0 #848

Full Changelog: 0.8.9...0.9.0

0.8.9

12 Apr 15:18
a7f0d3e

Profiler

  • Create BaseColumnProfiler.to_dict to make JSONable #766
  • Chi2 docs update #767
  • Create Profile Encoder to JSONify BaseColumnProfiler #769
  • Encode categorical column #770
  • Encode order column #772
  • Add and test JSONify DateTimeColumn #774

Documentation

  • Update docs 0.8.9 #779

Other Changes

  • fix: update ml reqs #777
  • Update to version 0.8.9 #778

Full Changelog: 0.8.8...0.8.9

0.8.8

21 Feb 22:56
7613a1a

Profiler

  • Quick chi2 test fix #763

Documentation

  • Update docs 0.8.8 #765
  • Chi2 docs update #767

Other Changes

  • Update to version 0.8.8 #764
  • PyPi image rendering issue #761
  • [BUG] update isort version pin #760
  • [BUG] isort version change #759

Full Changelog: 0.8.7.post1...0.8.8
