This week at Dozer #3

Original: https://getdozer.io/blog/2023/03/24/this-week-3


Welcome to this week's Dozer update! We are excited to share the latest progress we have made. Here is what is new this week.

Release v.0.1.13

Dozer v.0.1.13 is available. Check out the release notes here.

Insert and Update Conflict Resolution #1267

Dozer now supports conflict resolution when writing data to sinks. Depending on the type of data, developers can control how the app behaves on a conflict, for example when consistency and accuracy matter far more than speed and estimates.

endpoints:
  - name: data_api
    conflict_resolution:
      # Options: nothing | update | panic
      on_insert: update
      # Options: nothing | upsert | panic
      on_update: upsert
      # Options: nothing | panic
      on_delete: nothing
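
To make the options concrete, here is a minimal sketch of how such policies could be applied by a sink. It is not Dozer's implementation: the `Sink`, `ConflictResolution`, and `ConflictError` types are hypothetical, the meaning of each option is an assumption inferred from its name, and the `panic` option is modeled as a returned error so the behavior is easy to observe.

```rust
use std::collections::HashMap;

// Hypothetical policy enums mirroring the YAML options above.
#[derive(Clone, Copy, Debug)]
enum OnInsert { Nothing, Update, Panic }
#[derive(Clone, Copy, Debug)]
enum OnUpdate { Nothing, Upsert, Panic }
#[derive(Clone, Copy, Debug)]
enum OnDelete { Nothing, Panic }

#[derive(Clone, Copy, Debug)]
struct ConflictResolution {
    on_insert: OnInsert,
    on_update: OnUpdate,
    on_delete: OnDelete,
}

// Assumption: "panic" is surfaced as an error instead of aborting.
#[derive(Debug, PartialEq)]
enum ConflictError {
    DuplicateKey(u64),
    MissingKey(u64),
}

// Hypothetical in-memory sink keyed by primary key.
struct Sink {
    policy: ConflictResolution,
    records: HashMap<u64, String>,
}

impl Sink {
    fn new(policy: ConflictResolution) -> Self {
        Sink { policy, records: HashMap::new() }
    }

    // Insert conflict: the key already exists.
    fn insert(&mut self, key: u64, value: &str) -> Result<(), ConflictError> {
        if self.records.contains_key(&key) {
            match self.policy.on_insert {
                OnInsert::Nothing => return Ok(()), // keep the old record
                OnInsert::Update => {}              // overwrite below
                OnInsert::Panic => return Err(ConflictError::DuplicateKey(key)),
            }
        }
        self.records.insert(key, value.to_string());
        Ok(())
    }

    // Update conflict: the key does not exist yet.
    fn update(&mut self, key: u64, value: &str) -> Result<(), ConflictError> {
        if !self.records.contains_key(&key) {
            match self.policy.on_update {
                OnUpdate::Nothing => return Ok(()), // drop the update
                OnUpdate::Upsert => {}              // insert below
                OnUpdate::Panic => return Err(ConflictError::MissingKey(key)),
            }
        }
        self.records.insert(key, value.to_string());
        Ok(())
    }

    // Delete conflict: the key does not exist.
    fn delete(&mut self, key: u64) -> Result<(), ConflictError> {
        if self.records.remove(&key).is_none() {
            if let OnDelete::Panic = self.policy.on_delete {
                return Err(ConflictError::MissingKey(key));
            }
        }
        Ok(())
    }
}
```

With the configuration shown above (`update`, `upsert`, `nothing`), a second insert for the same key overwrites the record, an update for an unknown key creates it, and deleting an unknown key is silently ignored.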

Parallelized Joins #1180

Performance improvement on the order of 4x to 5x.

We have simplified and optimized the Join implementation, which resulted in a significant performance boost. When the query has a single source, the Processor is simply bypassed, since no operation on the record is necessary in this case. When the SQL contains one or more JOIN operators, one Product Processor is created for each join and the processors are connected to one another.

For example:

SELECT  name, department.name as dep, salary
FROM user 
JOIN department ON user.department_id = department.id 
JOIN country ON user.country_id = country.id;

This query is converted to a pipeline:
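
As a rough illustration of that shape, one stage per JOIN chained in sequence, the sketch below models each stage as a simple inner hash join. It is not Dozer's Product Processor: `JoinStage`, `run_pipeline`, the `Row` layout with qualified column names, and the sample data are all hypothetical, and the sketch runs sequentially rather than in parallel.

```rust
use std::collections::HashMap;

/// A record with qualified column names such as "user.name" (assumed layout).
type Row = HashMap<String, String>;

/// Convenience constructor for the rows used in the example.
fn row(fields: &[(&str, &str)]) -> Row {
    fields
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

/// Hypothetical stand-in for one product processor: an inner hash join
/// between incoming (left) rows and a pre-indexed right-hand table.
struct JoinStage {
    left_key: String,
    right_index: HashMap<String, Vec<Row>>,
}

impl JoinStage {
    fn new(left_key: &str, right_key: &str, right_rows: Vec<Row>) -> Self {
        let mut right_index: HashMap<String, Vec<Row>> = HashMap::new();
        for r in right_rows {
            let key = r.get(right_key).cloned();
            if let Some(k) = key {
                right_index.entry(k).or_default().push(r);
            }
        }
        JoinStage { left_key: left_key.to_string(), right_index }
    }

    /// Emits one merged row per matching right-hand row, or nothing.
    fn process(&self, left: &Row) -> Vec<Row> {
        let matches = left
            .get(&self.left_key)
            .and_then(|key| self.right_index.get(key));
        match matches {
            Some(rights) => rights
                .iter()
                .map(|right| {
                    let mut merged = left.clone();
                    merged.extend(right.clone());
                    merged
                })
                .collect(),
            None => Vec::new(),
        }
    }
}

/// Feeds the source rows through every stage in order.
/// With no stages the rows pass through untouched (the bypass case).
fn run_pipeline(source: Vec<Row>, stages: &[JoinStage]) -> Vec<Row> {
    stages.iter().fold(source, |rows, stage| {
        rows.iter().flat_map(|r| stage.process(r)).collect()
    })
}

/// Builds the two-join pipeline for the query above on made-up data.
fn example() -> Vec<Row> {
    let users = vec![
        row(&[("user.name", "Ann"), ("user.department_id", "1"), ("user.country_id", "7")]),
        row(&[("user.name", "Bob"), ("user.department_id", "2"), ("user.country_id", "7")]),
    ];
    let departments = vec![row(&[("department.id", "1"), ("department.name", "Eng")])];
    let countries = vec![row(&[("country.id", "7"), ("country.name", "SG")])];
    let stages = vec![
        JoinStage::new("user.department_id", "department.id", departments),
        JoinStage::new("user.country_id", "country.id", countries),
    ];
    run_pipeline(users, &stages)
}
```

In this made-up data set only Ann survives both joins, because Bob's department has no matching row. Passing an empty stage list returns the source rows unchanged, which mirrors the single-source bypass.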


Testing Strategy

Our focus has been introducing a number of test cases to increase the stability of Dozer.

Data type tests for connectors

Populate an external data source with every data type the connector supports, and Dozer will automatically check that all conversions work correctly. Put the data-populating code in DataReadyConnectorTest::new and you are done!

pub trait DataReadyConnectorTest: Send + Sized + 'static {
    type Connector: Connector;

    fn new() -> (Self, Self::Connector);
}

For example, the local storage connector implements it like this:

pub struct LocalStorageObjectStoreConnectorTest {
    _temp_dir: TempDir,
}

impl DataReadyConnectorTest for LocalStorageObjectStoreConnectorTest {
    type Connector = ObjectStoreConnector<LocalStorage>;

    fn new() -> (Self, Self::Connector) {
        let record_batch = record_batch_with_all_supported_data_types();
        let (temp_dir, connector) = create_connector("sample".to_string(), &record_batch);
        (
            Self {
                _temp_dir: temp_dir,
            },
            connector,
        )
    }
}
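
One detail worth noting is that `new` returns the test value alongside the connector, so resources such as the temporary directory stay alive for as long as the test value does. The self-contained sketch below illustrates that pattern using only the standard library. Everything except the shape of `DataReadyConnectorTest` is hypothetical: the simplified `Connector` trait with `list_objects`, `FileConnector`, `TempDirGuard`, `sample_dir`, and the `run_data_ready_test` driver are assumptions for illustration, not Dozer's test harness.

```rust
use std::fs;
use std::path::PathBuf;

// Hypothetical, heavily simplified connector interface.
trait Connector {
    fn list_objects(&self) -> Vec<String>;
}

// Same shape as the trait shown above.
trait DataReadyConnectorTest: Send + Sized + 'static {
    type Connector: Connector;

    fn new() -> (Self, Self::Connector);
}

// Removes the directory when dropped, similar in spirit to a temp-dir handle.
struct TempDirGuard(PathBuf);

impl Drop for TempDirGuard {
    fn drop(&mut self) {
        let _ = fs::remove_dir_all(&self.0);
    }
}

// Hypothetical connector that reads file names from a directory.
struct FileConnector {
    dir: PathBuf,
}

impl Connector for FileConnector {
    fn list_objects(&self) -> Vec<String> {
        let mut names: Vec<String> = fs::read_dir(&self.dir)
            .map(|entries| {
                entries
                    .filter_map(|e| e.ok())
                    .map(|e| e.file_name().to_string_lossy().into_owned())
                    .collect()
            })
            .unwrap_or_default();
        names.sort();
        names
    }
}

// Location of the sample data for this sketch (unique per process).
fn sample_dir() -> PathBuf {
    std::env::temp_dir().join(format!("data_ready_sketch_{}", std::process::id()))
}

struct FileConnectorTest {
    _guard: TempDirGuard,
}

impl DataReadyConnectorTest for FileConnectorTest {
    type Connector = FileConnector;

    fn new() -> (Self, Self::Connector) {
        // Populate the external data source before handing out the connector.
        let dir = sample_dir();
        fs::create_dir_all(&dir).expect("create sample dir");
        fs::write(dir.join("sample.csv"), "id,name\n1,a\n").expect("write sample data");
        (
            Self { _guard: TempDirGuard(dir.clone()) },
            FileConnector { dir },
        )
    }
}

// Hypothetical driver: `_test` is bound to a name, so it is dropped only
// at the end of the function, after the connector has been exercised.
fn run_data_ready_test<T: DataReadyConnectorTest>() -> Vec<String> {
    let (_test, connector) = T::new();
    connector.list_objects()
}
```

Calling `run_data_ready_test::<FileConnectorTest>()` lists the populated sample file, and the directory is cleaned up once the test value goes out of scope.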

Ingestion tests for connectors

Test if a connector ingests data as expected. Implement InsertOnlyConnectorTest (optionally CudConnectorTest) to test the most common connector methods used in Dozer. The test suite simulates a full run of Dozer to make sure the tested connector ingests and outputs data correctly.

For example, PostgresConnectorTest implements CudConnectorTest by executing SQL against the Postgres database.

impl CudConnectorTest for PostgresConnectorTest {
    fn start_cud(&self, operations: Vec<Operation>) {
        ...
        std::thread::spawn(move || {
            for operation in operations {
                client
                    .batch_execute(&operation_to_sql(
                        schema_name.as_deref(),
                        &table_name,
                        &operation,
                        &schema,
                    ))
                    .unwrap();
            }
        });
    }
}

As long as a connector passes this test suite, Dozer can guarantee data integrity when using that connector. The local storage and Postgres connectors have passed the test suite.
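
The general idea behind such a test can be sketched with the standard library alone: a background thread applies create, update, and delete operations to the source, the ingestion side consumes the resulting changes, and the final state is compared with what the operations should produce. The `Operation` enum, the channel standing in for the database, and the `ingest` function below are hypothetical simplifications, not Dozer's types or test suite.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc::{self, Receiver};
use std::thread;

// Hypothetical, simplified change operations keyed by an integer id.
#[derive(Clone, Debug, PartialEq)]
enum Operation {
    Insert { id: u32, value: String },
    Update { id: u32, value: String },
    Delete { id: u32 },
}

// Applies the operations on a background thread, like `start_cud` above,
// but sends them over a channel instead of executing SQL.
fn start_cud(operations: Vec<Operation>) -> Receiver<Operation> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        for operation in operations {
            if tx.send(operation).is_err() {
                break; // the receiver went away, stop producing
            }
        }
    });
    rx
}

// Stand-in for the ingestion side: folds the change stream into final state.
// The loop ends once the producer thread finishes and drops its sender.
fn ingest(changes: Receiver<Operation>) -> BTreeMap<u32, String> {
    let mut state = BTreeMap::new();
    for operation in changes {
        match operation {
            Operation::Insert { id, value } | Operation::Update { id, value } => {
                state.insert(id, value);
            }
            Operation::Delete { id } => {
                state.remove(&id);
            }
        }
    }
    state
}
```

A test would then call `ingest(start_cud(operations))` and assert that the resulting map matches the state expected from the list of operations.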

Integration Tests for Dozer Samples

We've added an integration test for each of the samples, so they won't break unexpectedly!

SQL integration tests #1282

Prop Tests

We have complemented our unit tests with a range of property-based tests (prop tests). Read more about prop tests here.

We have included various data type tests using the following approach (#1245):

proptest!(ProptestConfig::with_cases(1000), |(a in ".*", b in ".*")| {
    // Tests
});
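
For readers new to the technique, the sketch below shows what a property test does under the hood: it generates many inputs and checks that a property holds for each of them. To stay dependency-free it uses a hand-rolled xorshift generator and loop rather than the proptest crate, so it has none of proptest's strategies or shrinking. The names `check_property`, `roundtrips_through_string`, and `fits_in_i32` are hypothetical.

```rust
// Tiny xorshift64 pseudo-random generator (not the proptest crate).
struct XorShift64(u64);

impl XorShift64 {
    fn next_i64(&mut self) -> i64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x as i64
    }
}

// Runs `property` on `cases` generated inputs and returns the first
// counterexample, if any.
fn check_property<F>(cases: usize, seed: u64, property: F) -> Result<(), i64>
where
    F: Fn(i64) -> bool,
{
    // xorshift must not start from zero, so clamp the seed.
    let mut rng = XorShift64(seed.max(1));
    for _ in 0..cases {
        let input = rng.next_i64();
        if !property(input) {
            return Err(input);
        }
    }
    Ok(())
}

// A property that should hold for every i64: formatting and parsing
// gives back the same number.
fn roundtrips_through_string(x: i64) -> bool {
    x.to_string().parse::<i64>() == Ok(x)
}

// A deliberately wrong property: not every i64 fits in an i32.
fn fits_in_i32(x: i64) -> bool {
    (i64::from(i32::MIN)..=i64::from(i32::MAX)).contains(&x)
}
```

Running `check_property(1000, 42, roundtrips_through_string)` succeeds, while `check_property(1000, 42, fits_in_i32)` returns the offending input as a counterexample.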

Other Improvements & Fixes

Local Storage test #1290

This PR adds the mechanism needed to set up a local storage connector in e2e tests, and adds a new e2e test based on dozer-samples.

DataReadyConnectorTest #1296

Postgres test #1299

Graceful handling of gRPC API errors #1289

Add NY taxi sample to e2e test #1263

Add Postgres connector sample to e2e tests #1278

Changelog

https://github.com/getdozer/dozer/compare/v0.1.12...v0.1.13

Contact us
