· 6 years ago · Mar 16, 2020, 09:14 PM
# Open Play Test

## Coding task – data cleansing filter system

When signing up new clients we will need to import existing customers into our database. However, a lot of the incoming customer data will be incorrect or dirty, e.g. incorrect addresses, missing fields, etc.

As such, alongside importing the data we wish to flag dirty data fields so they can be cleaned in the future (e.g. prompt the customer to update their phone number next time they log in).

We require you to create a flexible system to implement this cleansing. The implementation should support the following:

- Ability to flag when a field is dirty/incorrect and, where possible, suggest an appropriate correction, e.g. phone format is 777878789, suggested format should be 07778 78789
- Modular, so different cleansing filters can be added as and when needed; filters might be title, name, address, country, date of birth, telephone format, etc.
- Ability to set a dirtiness score/rating against each filter whenever a record fails the filter. This can then power the overall "dirtiness score" of the record, e.g. if a record fails telephone then the dirtiness score is 10; if it fails telephone and address then the dirtiness score is 30
- Some suggestions, such as address, may require calling third-party services via API for validation

Some overall guidance on what is expected:

- You do not need to worry about the import of data into the system via CSV etc.; you can assume it's in an array, Laravel collection, etc.
- Not all filters mentioned above need to be implemented, just enough to show the task has been understood
- No cleaning needs to occur, only flagging of dirty data and storing of the suggested correct version
- Suitable tests to show the filters working
- Code to be supplied either via zip, GH repo, etc.

## Code description

Since the task is to validate already imported models (rather than a form, as is usual), I've decided to use Laravel's validation combined with a trait that adds the missing functionality (a dirty score and, where possible, suggestions). This lets us reuse the functionality for any other imported data set by simply adding the use DataCleanupTrait; statement.

The dirty score and suggestions are stored in a separate table (data_cleanup), which holds the referenced table, the record key, the dirty score as an integer, and a JSON object with the error messages and any suggestions.
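To make the idea concrete, here is a minimal plain-PHP sketch of a trait that scores a record and builds a row shaped like the data_cleanup table described above. It is not the actual DataCleanupTrait, which relies on Laravel's validator. The names DataCleanupSketch, ImportedUser and cleanupReport, the column names, and the point values are all assumptions made for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for DataCleanupTrait (the real one uses Laravel's validator).
trait DataCleanupSketch
{
    /**
     * Run every rule against the record and build a data_cleanup-style row.
     * Expects the using class to define $table, $rules and $points.
     */
    public function cleanupReport(int $key, array $record): array
    {
        $score = 0;
        $issues = [];

        foreach ($this->rules as $field => $check) {
            // Each check returns null on success, or a message/suggestion pair on failure.
            $result = $check($record[$field] ?? null);
            if ($result !== null) {
                $score += $this->points[$field] ?? 0;
                $issues[$field] = $result;
            }
        }

        return [
            'table'       => $this->table,
            'key'         => $key,
            'dirty_score' => $score,
            'details'     => json_encode($issues),
        ];
    }
}

// Hypothetical model; any class with $table, $rules and $points could use the trait.
final class ImportedUser
{
    use DataCleanupSketch;

    protected string $table = 'users';
    protected array $points = ['first_name' => 5, 'telephone' => 10]; // assumed values
    protected array $rules;

    public function __construct()
    {
        $this->rules = [
            'first_name' => fn ($v) => is_string($v) && ctype_alpha($v)
                ? null
                : ['message' => 'First name must be alphabetical.', 'suggestion' => null],
            'telephone' => fn ($v) => is_string($v) && preg_match('/^0\d{4} \d{5}$/', $v) === 1
                ? null
                : ['message' => 'Telephone format is invalid.', 'suggestion' => null],
        ];
    }
}

// Only first_name fails here, so the dirty score is that field's points.
$row = (new ImportedUser())->cleanupReport(1, ['first_name' => 'Ann3', 'telephone' => '07778 78789']);
```

Keeping the rules and points on the class that uses the trait is what would let another imported data set reuse the same scoring without changes to the trait itself.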

## Relevant Files

- app/Models/*.php
- app/Rules/*.php
- app/Traits/DataCleanupTrait.php
- config/userValidation.php
- database/factories/UserFactory.php
- database/migrations/*.php
- tests/*
- .env / .env.testing
## Validators

Given this is a test to evaluate different cleansing mechanisms, I've decided to implement a few different approaches:

Using Laravel's native validators: Naturally, when a framework already provides functionality that matches our requirements, we should use the framework's tools where possible. I've used the string and alpha validators for the first and last name fields in the User object.

Adding a custom rule to Laravel validation: Native validators may not do everything we need (validating a phone number, for example), but we can extend them and still use the result anywhere, hence the convenience of extending. This method also allows us more control over the error message and the use of external APIs. It has been used for the address field (app/Rules/AddressApi) and for the UKMobile and UKLandLine numbers.

The address API rule uses the Google Maps geolocation API to try to get an exact address match. If there isn't an exact match, it returns a set of possible addresses in the error message. If there is no match at all, there are naturally no suggestions.

Since one of the great things about Laravel is its community, the UK mobile and landline validators use the propaganistas/laravel-phone package to try to match the submitted format to the expected one. If there's no exact match, the value returned by the package is suggested.
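The behaviour of the two custom rules can be sketched in plain PHP. This is an illustration only: it calls neither Google Maps nor propaganistas/laravel-phone. The function names, the injected $lookup callable, the fake address data, and the assumed "0XXXX XXXXXX" mobile format are all hypothetical stand-ins.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical address check. $lookup stands in for the external API call
 * and returns a list of formatted candidate addresses.
 */
function checkAddress(string $address, callable $lookup): array
{
    $candidates = array_values($lookup($address));

    // Exact match: the field is clean.
    if (in_array($address, $candidates, true)) {
        return ['passes' => true, 'suggestions' => []];
    }

    // Near matches become suggestions; an empty list means nothing can be suggested.
    return ['passes' => false, 'suggestions' => $candidates];
}

/**
 * Hypothetical formatter standing in for the package's output.
 * Assumes a UK mobile is expected as "0XXXX XXXXXX" (11 digits starting 07).
 */
function suggestUkMobile(string $input): ?string
{
    $digits = (string) preg_replace('/\D+/', '', $input);

    // Normalise an international prefix (0044 or 44) to a leading 0.
    if (strpos($digits, '0044') === 0) {
        $digits = '0' . substr($digits, 4);
    } elseif (strpos($digits, '44') === 0) {
        $digits = '0' . substr($digits, 2);
    }

    if (preg_match('/^07\d{9}$/', $digits) !== 1) {
        return null; // not recognisable as a UK mobile, so nothing to suggest
    }

    return substr($digits, 0, 5) . ' ' . substr($digits, 5);
}

/** Passes only when the input already equals the expected format. */
function checkUkMobile(string $input): array
{
    $expected = suggestUkMobile($input);

    return [
        'passes'     => $expected !== null && $expected === $input,
        'suggestion' => $expected === $input ? null : $expected,
    ];
}

// Fake lookup for illustration only; no network request is made.
$fakeLookup = function (string $query): array {
    $known = ['10 Downing Street, London SW1A 2AA, UK'];

    return array_filter(
        $known,
        fn (string $candidate): bool => stripos($candidate, $query) !== false
    );
};

$exact   = checkAddress('10 Downing Street, London SW1A 2AA, UK', $fakeLookup); // passes
$partial = checkAddress('10 Downing Street', $fakeLookup);                      // fails, one suggestion
$unknown = checkAddress('Nowhere Lane', $fakeLookup);                           // fails, no suggestions
$mobile  = checkUkMobile('+44 7911 123456');                                    // fails, suggests the local format
```

Passing the lookup in as a callable is a design choice for the sketch: it lets tests substitute a fake for the third-party service without making any network requests.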

## Validation Rules

Since there weren't many rules for the imported users in the task description, I've only implemented the following:

- First and last names: should be alphabetical.
- Phone numbers: should match the formatted value returned by the propaganistas/laravel-phone package.
- Addresses: should match the value returned by the Google Maps geolocation API.

Naturally, in a production environment this would require a lot more thought and business/domain knowledge, and quite possibly more rules for each field.

## Installation

This is a very simple Laravel application. However, a Google Maps API key is required; one has been provided in the .env.example and .env.testing files.

The validator config (points) is stored in 'config/userValidation.php', while the validation rules are stored in the User model as $rules.

To install the project, update the chosen .env file with your desired database config and run php artisan migrate from the project root.
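For orientation, a points config of the kind mentioned above might look roughly like the following. This is a guess at the shape, not the contents of the actual config/userValidation.php: the key names are assumptions, and only the telephone and address values follow the example in the task description (10 for telephone, 30 for telephone plus address).

```php
<?php

// Hypothetical sketch of config/userValidation.php; key names are assumptions.
return [
    'points' => [
        'first_name' => 5,  // assumed value
        'last_name'  => 5,  // assumed value
        'telephone'  => 10, // from the task example
        'address'    => 20, // from the task example (10 + 20 = 30)
    ],
];
```

With a file of this shape, Laravel's config helper would read a single value as config('userValidation.points.telephone').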

## Tests

Basic unit tests have been written for the DataCleanupTrait. Although a trait could be tested using an anonymous class, we need a test model with rules and data, so a UserFactory has been created and the User class is used in the tests.

All tests are stored in the tests directory under Feature or Unit, depending on the type. The test directories follow the app file structure.

Unit and feature tests can be run with the vendor/bin/phpunit command from the project root.