Releases: capitalone/DataProfiler
0.10.6
Profiler
- Staging/main/0.10.6 #1065
- Update: Version 0.10.6 #1064
- Feature: Plugins #1060
- Hot Fix: Contribution Doc #1059
- Rename references to degree of freedom from df to deg_of_free #1056
- add_s3_connection_remote_loading_s3uri_feature #1054
- feat: add null ratio to column stats #1052
- Delay transforming priority_order into ndarray #1045
- Fix Codeowners List #1043
Documentation
- Update: Documentation 0.10.6 #1066
- Docs: AWS S3 Data Reading #1063
- Update docs to reflect renamed output of deg_of_free #1057
Full Changelog: 0.10.5...0.10.6
What's Changed
- Fix Codeowners List by @taylorfturner in #1044
- Staging/main/0.10.6 by @taylorfturner in #1065
0.10.5
Profiler
Documentation
Full Changelog: 0.10.4...0.10.5
What's Changed
- Categorical PSI by @taylorfturner in #1040
0.10.4
Profiler
- version bump (#1032) #1036
- Staging/main/0.10.4 #1029
- added psi calculation to categorical columns #1027
- Bump actions/checkout from 3 to 4 #1024
- Minor: Profiler Path Fix in Example Notebook #1021
- modified the assignees for issue creation #1016
- Make sure random_state is a list before indexed assignment #968
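For readers unfamiliar with the PSI entry above (#1027), the population stability index compares how a categorical column's distribution shifts between two samples. The sketch below is a generic illustration of the standard formula, not DataProfiler's implementation; the function name `categorical_psi` and the `eps` floor for unseen categories are assumptions made for this example.

```python
import math
from collections import Counter


def categorical_psi(expected, actual, eps=1e-4):
    """Population stability index between two non-empty categorical samples.

    Hypothetical helper for illustration only; eps is an assumed floor that
    keeps a category missing from one sample from producing log(0).
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    exp_counts = Counter(expected)
    act_counts = Counter(actual)
    n_exp, n_act = len(expected), len(actual)
    psi = 0.0
    # Sum over every category seen in either sample.
    for category in set(exp_counts) | set(act_counts):
        p = max(exp_counts[category] / n_exp, eps)
        q = max(act_counts[category] / n_act, eps)
        psi += (q - p) * math.log(q / p)
    return psi


baseline = ["a"] * 50 + ["b"] * 50
shifted = ["a"] * 90 + ["b"] * 10
print(categorical_psi(baseline, baseline))  # identical samples give 0.0
print(categorical_psi(baseline, shifted))
```

Each term `(q - p) * log(q / p)` is non-negative, so the index grows as the two distributions diverge.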
Documentation
Full Changelog: 0.10.3...0.10.4
What's Changed
0.10.3
Profiler
Feature: Multiprocess
- Staging: into dev feature/multiprocess #998
- Multiprocess automation feature into staging/dev. #997
- Syncing feature/multiprocess into staging/dev/multiprocess #992
- Automate multiprocess option #984
Feature: num_quantiles option
- Staging: into dev feature/num-quantiles #990
- Fix Scipy Mend Issue #988
- HistogramAndQuantilesOption sync with dev branch #987
Documentation
- Update docs to 0.10.3 #1012
- Update docs to 0.10.3 #1011
- fixed snappy install issue on Mac #1010
- Staging: into dev-gh-pages the docs for multiprocess. #1001
- Add docs to multiprocess option in StructuredOptions. #999
- Staging: into dev-gh-pages the docs for num_quantiles. #993
- Add docs for num_quantiles option for histogram_and_quantiles. #991
Full Changelog: 0.10.2...0.10.3
What's Changed
- Staging: main 0.10.3 by @taylorfturner in #1004
0.10.2
Profiler
- hotfix[0.10.2]: cat vs float bug #973
Documentation
Full Changelog: 0.10.1...0.10.2
What's Changed
0.10.1
Profiler
- Hot Fix: .astype("bool") #960
Documentation
Full Changelog: 0.10.0...0.10.1
What's Changed
- Hot Fix: .astype("bool") by @taylorfturner in #960
0.10.0
Profiler
- Forking workflow directions CONTRIBUTING.md #857
- Fixing diagram rendering in CONTRIBUTING.md #862
- Fix initial value of processor_type #863
- fix: test bug due to bad mocks #878
- added differences section to unstructured data example #877
- Reservoir sampling refactor #910
- feat: add dev to workflow for testing #897
- Cms for categorical #892
- Hotfix: fix post feature serialization merge #942
- Update version to 0.10.0 #944
- Staging/main/0.10.0 #943
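The "Reservoir sampling refactor" entry (#910) refers to a technique for drawing a fixed-size uniform sample from a stream of unknown length. As a rough illustration only, here is the textbook Algorithm R; it is not taken from the DataProfiler code base, and the name `reservoir_sample` and its `seed` parameter are assumptions for this sketch.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Return up to k items drawn uniformly from an iterable (Algorithm R).

    Hypothetical helper for illustration; seed makes the draw repeatable.
    """
    rng = random.Random(seed)
    reservoir = []
    for index, item in enumerate(stream):
        if index < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (index + 1).
            slot = rng.randint(0, index)  # both endpoints inclusive
            if slot < k:
                reservoir[slot] = item
    return reservoir


sample = reservoir_sample(range(1000), k=5, seed=0)
print(sample)
```

Only `k` items are held in memory at any time, which is why the approach suits large inputs that are read in a single pass.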
Profiler: Profile Serialization
- Staging/dev/profile serialization #940
- fix: order bug #939
- fix: null_rep mat should calculate even if datetime #933
- Profiler: load_method hotfix #932
- Top level hotfix: save / load .lower() #931
- Notebook Example save/load Profile #930
- refactor: use seed for sample for consistency #927
- Profile Builder load() serialization #925
- Reuse passed labeler #924
- BaseProfiler save() for json #923
- Added testing for values for test_json_decode_after_update #915
- UnstructuredProfiler: Added NoImplementationError #907
- fix: bug and add tests for structuredcolprofiler #904
- Structured profiler encode decode #903
- refactor: allow options to go through all #902
- StructuredColProfiler Encode / Decode #901
- Decode options #894
- Quick Test update #893
- Deserialization of datalabeler #891
- ColumnDataLabelerCompiler: serialize / deserialize #888
- Add Serialization and Deserialization Tests for Stats Compiler, plus refactors for order Typing #887
- Adds deserialization for compilers and validates tests for Primitive; fixes numerical deserialization #886
- Adds tests validating serialization with Primitive type for compiler #885
- feat: add test and compiler serialization #884
- ready datalabeler for deserialization and improvement on serializatio… #879
- Encode Options #875
- Encode/Decode TextColumnProfiler #870
- Created encoder for the datalabelercolumn #869
- Added test to ensure order attribute for ordered column profiler functions correctly after deserialization #868
- Added decoding for encoding of ordered column profiles #864
- Json decode date time column #861
- Float column profiler encode decode #854
- hot fixes for encode and decode of numeric stats mixin and intcol pro… #852
Profiler: Options
- staging/dev/options #909
- RowStatisticsOptions: Implementing option #871
- New preset implementation and test #867
- RowStatisticsOptions: Add option #865
Documentation
- Staging update docs 0.10.0 #945
- Documentation: Fix Req #922
- Documentation: Update for Reservoir Sampling #919
- documentation update for cms specific options to category #917
- Add forking / branch workflow image #858
Documentation: Profile Serialization
- Merge staging/dev-gh-pages/profile-serialization into dev-gh-pages #937
- Docs: Profiler Serialization Clean Up #936
- Docs: Profiler Serialization #928
Documentation: Options
- Documentation: feature/options branch docs updates #921
- Row statistics option documentation #883
- updating docs for preset name #882
- Add documentation for median_abs_deviation option #881
- Preset test updated w new names and different toggles #880
- reset ignore, update .gitignore, update documentation on presets #874
- Fixed documentation for sampling_ratio option #873
Full Changelog: 0.9.0...0.10.0
What's Changed
- Sampling ratio implement by @joshuart in #845
- StructuredOptions: hhl_row_hashing by @micdavis in #841
- Forking workflow directions CONTRIBUTING.md by @taylorfturner in #857
- Fixing diagram rendering in CONTRIBUTING.md by @taylorfturner in #862
- StructuredProfiler: HLLRowHashing by @micdavis in #842
- added differences section to unstructured data example by @lizlouise1335 in #877
- fix: test bug due to bad mocks by @JGSweets in #878
- Fix initial value of processor_type by @junholee6a in #863
- Staging/main/0.10.0 by @taylorfturner in #943
0.9.0
Profiler
- Encode int column #780
- Decode categorical #786
- Encode update format #789
- Optimization for text column profile ksneab #791
- Remove unnecessary cast() in csv_data.py (1) #796
- Remove unnecessary cast() in csv_data.py (2) #798
- Update main with change in memory-optimization #799
- Remove unnecessary cast() in data.py #800
- Remove unnecessary cast() in graph_data.py #801
- Fix CategoricalColumn test #804
- Specify init calls in data readers reload() methods #805
- Fix dask dataframe import #812
- Fix CharsetMatches type error #813
- Json Decoder Code Cleanup #814
- Fix override errors #819
- Sampling ratio option #825
- Memory Optimization to main #832
- Fixed testing to run on all feature branches for PRs #793
- Part 1 fix for categorical mem opt issue #795
- cleanup time space analysis code #797
- quick update to feature/memory-optimization for merge to main #802
- Update feat mem #803
- Categorical Stop Condition Options #808
- Space time analysis improvement #809
- implementation of setting stop conds via options for cat column profiler #810
- Fix for histogram merging #815
- Fixes categorical bug when stop condition is met #816
- hotfix for more conservative stop condition in categorical columns #817
- Coverage Fix Memory Optimization Feature Branch #823
- Added option to remove calculations for updating row statistics #827
- Fix to doc strings #829
- Preset Option Fix: presets docstring added #830
- Fix LSP violations #840
- Fix argument types in doc comments #843
Documentation
Other Changes
- Memory testing and data gen scripts #781
- Update for new Dask version in Validator test #784
- Space analysis dataset sampling addition #787
- fix bug in dataset generation #790
- Update pre-commit mypy dependencies #811
- Coverage Fix to Main Branch #822
- Update version to 0.9.0 #848
Full Changelog: 0.8.9...0.9.0
What's Changed
- Create method to serialize NumericalStatsMixin and functions by @kshitijavis in #776
- Memory testing and data gen scripts by @ksneab7 in #781
- Update for new Dask version in Validator test by @JGSweets in #784
- Encode int column by @kshitijavis in #780
- Fix minor typo by @junholee6a in #788
- Space analysis dataset sampling addition by @ksneab7 in #787
- fix bug in dataset generation by @ksneab7 in #790
- Optimization for text column profile ksneab by @ksneab7 in #791
- Encode update format by @kshitijavis in #789
- Remove unnecessary cast() in csv_data.py (1) by @junholee6a in #796
- Remove unnecessary cast() in csv_data.py (2) by @junholee6a in #798
- Update main with change in memory-optimization by @taylorfturner in #799
- Remove unnecessary cast() in data.py by @junholee6a in #800
- Remove unnecessary cast() in graph_data.py by @junholee6a in #801
- Decode categorical by @kshitijavis in #786
- Fix CategoricalColumn test by @kshitijavis in #804
- Specify init calls in data readers reload() methods by @junholee6a in #805
- Fix dask dataframe import by @junholee6a in #812
- Fix CharsetMatches type error by @junholee6a in #813
- Json Decoder Code Cleanup by @micdavis in #814
- Update pre-commit mypy dependencies by @junholee6a in #811
- Coverage Fix to Main Branch by @taylorfturner in #822
- Fix override errors by @junholee6a in #819
- Memory Optimization to main by @taylorfturner in #832
- Fix LSP violations by @junholee6a in #840
- Sampling ratio option by @joshuart in #825
- Fix argument types in doc comments by @junholee6a in #843
- Update version to 0.9.0 by @taylorfturner in #848
New Contributors
- @junholee6a made their first contribution in #788
- @joshuart made their first contribution in #825
0.8.9
Profiler
- Create BaseColumnProfiler.to_dict to make JSONable #766
- Chi2 docs update #767
- Create Profile Encoder to JSONify BaseColumnProfiler #769
- Encode categorical column #770
- Encode order column #772
- Add and test JSONify DateTimeColumn #774
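The entries above introduce a `to_dict` method and a profile encoder for JSON output. The general pattern can be sketched with the standard library alone, as below. This is an assumption-laden illustration, not the library's code: `ColumnStats`, `DECODABLE`, and `decode` are hypothetical names, and the `{"class": ..., "data": ...}` layout is chosen only for this example.

```python
import json


class ColumnStats:
    """Hypothetical stand-in for a column profiler; not a DataProfiler class."""

    def __init__(self, name, sample_size=0, null_count=0):
        self.name = name
        self.sample_size = sample_size
        self.null_count = null_count

    def to_dict(self):
        # Tag the payload with the class name so a decoder can pick the type.
        return {"class": type(self).__name__, "data": dict(vars(self))}


class ProfileEncoder(json.JSONEncoder):
    """Falls back to to_dict() for objects json cannot serialize natively."""

    def default(self, o):
        if hasattr(o, "to_dict"):
            return o.to_dict()
        return super().default(o)


# Registry mapping the "class" tag back to a constructor (assumed layout).
DECODABLE = {"ColumnStats": ColumnStats}


def decode(text):
    """Rebuild an object from the tagged JSON produced by ProfileEncoder."""
    payload = json.loads(text)
    cls = DECODABLE[payload["class"]]
    return cls(**payload["data"])


original = ColumnStats("age", sample_size=100, null_count=4)
restored = decode(json.dumps(original, cls=ProfileEncoder))
print(vars(restored))
```

Keeping the class tag in the payload lets one decoder handle several column types without the caller knowing in advance which one was saved.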
Documentation
- Update docs 0.8.9 #779
Other Changes
Full Changelog: 0.8.8...0.8.9
What's Changed
- Create BaseColumnProfiler.to_dict to make JSONable by @kshitijavis in #766
- Create Profile Encoder to JSONify BaseColumnProfiler by @kshitijavis in #769
- Encode categorical column by @kshitijavis in #770
- Encode order column by @kshitijavis in #772
- Add and test JSONify DateTimeColumn by @kshitijavis in #774
- fix: update ml reqs by @JGSweets in #777
- Update to version 0.8.9 by @taylorfturner in #778
New Contributors
- @kshitijavis made their first contribution in #766
0.8.8
Profiler
- Quick chi2 test fix #763
Documentation
Other Changes
- Update to version 0.8.8 #764
- PyPi image rendering issue #761
- [BUG] update isort version pin #760
- [BUG] isort version change #759
Full Changelog: 0.8.7.post1...0.8.8
What's Changed
- [BUG] isort version change by @micdavis in #759
- [BUG] update isort version pin by @taylorfturner in #760
- PyPi image rendering issue by @taylorfturner in #761
- Quick chi2 test fix by @taylorfturner in #763
- Update to version 0.8.8 by @micdavis in #764