Changes
- Pin numpy to versions < 2 - numpy/numpy#26710
Changes
- Dependabot updates to:
  - python-dotenv
  - pipelinewise-singer-python
  - pytest
  - boto3
- Updates to code to pass pylint validation
Changes
Update to use snowflake-connector-python 3.5.0, which in turn updates the following upstream packages:
- pyarrow
- cryptography

The upstream updates are needed:
- to resolve security issues (CVE-2023-38325): cryptography>=41.0.3
- to resolve security issues 2023-11-25 (CVE-2023-47248): pyarrow>=14.0.1

Updating to use snowflake-connector-python 3.5.0 uplifts these packages to:
- cryptography>=3.1.0,<42.0.0
- pyarrow (no version specified, so latest; at the time of writing this is 14.0.1)
Changes
- Set pandas.DataFrame option dtype=object to prevent loss of precision with large integers (Fixes #404)
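
One common way this kind of precision loss arises: a 64-bit float has a 53-bit significand, so not every integer above 2**53 can be represented exactly, and pandas by default stores an integer column containing missing values as float64. Keeping the column as dtype=object leaves the values as Python integers instead. The sketch below shows only the underlying rounding, using the standard library; the sample value is made up for illustration and is not taken from issue #404.

```python
# Illustrative only: shows the float64 rounding that dtype=object avoids.
big_id = 2**53 + 1  # hypothetical large integer key, e.g. a 64-bit ID

# Converting to a 64-bit float silently rounds to the nearest representable value.
as_float = float(big_id)
round_tripped = int(as_float)

print(big_id)                    # 9007199254740993
print(round_tripped)             # 9007199254740992
print(big_id == round_tripped)   # False
```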
Changes
- Allow retention period in days to be optionally set in the config using the retention parameter
Changes
- Update snowflake-connector-python[pandas] requirement from ==2.7.* to >=2.7,<3.1 (Dependabot Alert CVE ID: CVE-2022-42965)
Changes
- Initial release on mjsqu fork
- Pre-tagging before Dependabot PR for snowflake-connector-python
Changes
- Revert use of ujson
Changes
- Use ujson for JSON encoding/decoding
Fixes
- Only drop pk constraint if table has one
- Don't raise PrimaryKeyNotFoundException when a record has a falsy pk value
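
The falsy primary key fix can be pictured as the difference between a truthiness check and an explicit None check. The sketch below is an assumption about the general shape of such a check, not the target's actual code; the function name and the exception class are hypothetical stand-ins.

```python
class PrimaryKeyNotFoundError(Exception):
    """Hypothetical stand-in for the target's PrimaryKeyNotFoundException."""


def primary_key_string(record, key_properties):
    """Join a record's primary key values, accepting falsy values such as 0 or ''."""
    parts = []
    for key in key_properties:
        # Only a missing key or an explicit None counts as "not found";
        # 0, False and '' are legitimate primary key values.
        if record.get(key) is None:
            raise PrimaryKeyNotFoundError(f"Primary key '{key}' not found in record")
        parts.append(str(record[key]))
    return ",".join(parts)


print(primary_key_string({"id": 0, "name": "first"}, ["id"]))  # 0
```

A check written as `if not record.get(key)` would reject the record above, because 0 is falsy in Python.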
Fixes
- Respecting flush_all_streams when SCHEMA messages arrive.
- Improve logging for failed merge & copy queries.
- Drop NOT NULL constraint from primary key columns.
- Update PK constraints according to changes to SCHEMA's key properties.
Changes
- Dropping support for Python 3.6
- Adding support for Python 3.9
- Bump pytest to 7.1.1
- Bump boto3 to 1.21
Added
- Support parallelism for table stages
Fixes
- Emit last encountered state message if there are no records.
Changes
- Migrate CI to github actions
- Bump dependencies
- Increase max_records when selecting columns by an order of magnitude
- Bumping dependencies
- Add support for date property format
- Stop logging record when error happens
- Fixed an issue with S3 metadata required for decryption not being included in archived load files.
- Add archive_load_files parameter to optionally archive load files on S3
- Bumping dependencies
- Add optional batch_wait_limit_seconds parameter
- Bumping dependencies
- Fixed an issue when SHOW FILE FORMATS ran too many times, slowing down the startup time of the target
- Bump snowflake-connector-python from 2.3.10 to 2.4.1
- Bump numpy from <1.20.0 to <1.21.0
- Add parquet support
- Add check and few logs in the date parsing routine
- Bumping dependencies
- Update caching mechanism to fix issue with badly ordered queries in a transaction
- Introduced a reserved named parameter for prepared statements.
- Do not use parallel file upload with PUT command and table stages.
- Bumping dependencies
- Add {{database}} token to query_tag parameter
- Use Jinja style query_tag template variables
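
As a rough illustration of Jinja-style tokens in query_tag, the sketch below substitutes placeholders with plain string replacement. Only the {{database}} token is named above; the {{schema}} and {{table}} tokens, the function name, and the sample values are assumptions for illustration, and the target's real implementation may differ.

```python
def render_query_tag(template, database="", schema="", table=""):
    """Replace Jinja-style tokens in a query tag template (hypothetical helper)."""
    tokens = {
        "{{database}}": database,
        "{{schema}}": schema,  # assumed token, not confirmed by the changelog
        "{{table}}": table,    # assumed token, not confirmed by the changelog
    }
    rendered = template
    for token, value in tokens.items():
        rendered = rendered.replace(token, value)
    return rendered


tag = render_query_tag(
    "loading {{database}}.{{schema}}.{{table}}",
    database="analytics",
    schema="public",
    table="orders",
)
print(tag)  # loading analytics.public.orders
```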
- Fixed a dependency issue
- Add everything from the unreleased 1.9.0
- Use snowflake table stages by default to load data into tables
- Add optional query_tag parameter
- Add optional role parameter to use custom roles
- Fixed an issue when generated file names were not compatible with windows
- Bump joblib to 0.16.0 to be python 3.8 compatible
- Bump snowflake-connector-python to 2.3.6
- Bump boto3 to 1.16.20
- Fixed an issue when pipelinewise-target-snowflake failed when QUOTED_IDENTIFIERS_IGNORE_CASE snowflake parameter set to true
- Add aws_profile option to support Profile based authentication to S3
- Add option to authenticate to S3 using AWS_PROFILE, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN environment variables
- Add s3_endpoint_url and s3_region_name options to support non-native S3 accounts
- Flush stream only if the new schema is not the same as the previous one
- Add s3_acl option to support ACL for S3 upload
- Fixed an issue when no primary key error logged as INFO and not as ERROR
- Fixed an issue where new columns were sometimes not added to the target table
- Fixed an issue where the query runner returned an incorrect value when multiple queries ran in one transaction
- Switch jsonschema to use Draft7Validator
- Fix loading tables with space in the name
- Generate compressed CSV files by default. Optionally can be disabled by the no_compression config option
- Log inserts, updates and csv size_bytes in a more consumable format
- Use SHOW SCHEMAS|TABLES|COLUMNS instead of INFORMATION_SCHEMA
- Support usage of reserved words as table names.
- Support custom logging configuration by setting LOGGING_CONF_FILE env variable to the absolute path of a .conf file
- Change default /tmp folder for encrypting files
- Make AWS key optional and obtain it secondarily from env vars
- Add temp_dir optional parameter to config
- Fixed issue when JSON value not sent correctly
- Load binary data into Snowflake BINARY data type column
- Add missing module python-dateutil
- Review dates & timestamps and fix them before insert/update
- Pinned stable version of urllib3
- Pinned stable version of botocore and boto3
- Fixed issue when extracting bookmarks from the state messages sometimes failed
- Bump snowflake-connector-python to 2.0.3
- Fixed an issue when number of rows in buckets were not calculated correctly and caused flushing of data at the wrong time with degraded performance
- Fixed an issue when sometimes the last bucket of data was not flushed correctly
- Bump snowflake-connector-python to 2.0.1
- Always use secure connection to Snowflake and force auto commit
- Add flush_all_streams option
- Add parallelism option
- Add max_parallelism option
- Emit new state message as soon as data flushed to Snowflake
- Log SQLs only in debug mode
- Further improvements in information_schema.tables caching
- Improved and optimised information_schema.tables caching
- Caching information_schema.tables to avoid long running SQLs in snowflake
- Instead of DROPPING existing column RENAME it
- Add data_flattening_max_level option
- Optimised queries to information_schema.tables
- Create _sdc_deleted_at as VARCHAR to avoid issues caused by invalid formatted date-times received from taps
- Manage only three metadata columns: _sdc_extracted_at, _sdc_batched_at and _sdc_deleted_at
- Initial release