Sub-setting
SUB-SETTING DEFINITION
Sub-setting is a method of selecting a predefined data sub-set necessary for testing a particular development iteration of the code in a specified environment.
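As a rough illustration of the definition, the sketch below picks a sub-set from two in-memory collections: the transactional rows that match a predefined rule, plus only the master rows those transactions reference. The `Customer` and `Order` types, the `status` field, and the `select_subset` function are hypothetical names introduced for this example; they are not part of any product described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:          # hypothetical master-data record
    customer_id: int
    name: str


@dataclass(frozen=True)
class Order:             # hypothetical transactional record
    order_id: int
    customer_id: int
    status: str


def select_subset(customers, orders, wanted_statuses):
    """Return (customers, orders) restricted to the predefined rule.

    The rule here is an assumption: keep orders whose status is in
    `wanted_statuses`, then keep only the customers those orders reference,
    so the sub-set stays referentially consistent.
    """
    picked_orders = [o for o in orders if o.status in wanted_statuses]
    needed_ids = {o.customer_id for o in picked_orders}
    picked_customers = [c for c in customers if c.customer_id in needed_ids]
    return picked_customers, picked_orders
```

A test iteration that only touches refund handling could call `select_subset(customers, orders, {"refunded"})` and load just that slice into the target environment.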
DEVELOPMENT LIFECYCLE
Let's consider the data origination and movement in the development lifecycle.
DEVELOPMENT ENVIRONMENT
developing from scratch
Any development starts with an idea, which translates into code, and this code is tested with some data. Test use cases define which data to use and create. At this stage, it is important to find data sources, and the random data masking components that contain data sets provide an easy means of creating data. These include Address, First and Last Names, Social Security Number, Credit Card, etc., as well as the 'regular expression' component. For an example of creating a set of random credit card numbers, see the video.
Developers can create the rest of the random sets in a similar fashion.
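Outside of the masking components mentioned above, the same idea can be sketched in a few lines of Python. The example below generates random card-like numbers whose last digit satisfies the Luhn checksum, which is the check most card-number validators apply. The function names, the `"4"` prefix, and the 16-digit length are assumptions chosen for illustration; this is not how the product's Credit Card component is implemented.

```python
import random


def luhn_check_digit(partial: str) -> str:
    """Return the Luhn check digit for a digit string that lacks its last digit."""
    total = 0
    # Walk right to left. The rightmost digit of `partial` is doubled because
    # the check digit will be appended after it.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def is_luhn_valid(number: str) -> bool:
    """Check a complete digit string against the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:  # every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def random_card_number(rng: random.Random, prefix: str = "4", length: int = 16) -> str:
    """Build a random test card number (assumed prefix and length)."""
    body_len = length - len(prefix) - 1
    body = prefix + "".join(str(rng.randrange(10)) for _ in range(body_len))
    return body + luhn_check_digit(body)


def random_card_numbers(count: int, seed: int = 0) -> list:
    """Generate `count` numbers; a fixed seed keeps the test set repeatable."""
    rng = random.Random(seed)
    return [random_card_number(rng) for _ in range(count)]
```

Seeding the generator is a deliberate choice: a developer's sandbox usually benefits from a test set that can be regenerated identically.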
The amount of data needed for testing at this stage of development is usually the bare minimum, and it is used mainly to verify functional requirements. Non-functional requirements are, of course, taken into account in the architecture, but they are usually tested in a bigger, more system-like environment than the developer's sandbox.
This development stage is used to create the first iteration(s) of code, before exiting into the production environments.
developing within mature application
Often, especially in big organizations, development uses the data of existing big systems. Whether one needs to develop a new system or a set of new features for an existing system, developers would define the data necessary for testing as "all the existing master data plus some transactional". They would need the existing schema and a way to populate both the master and the transactional data sets. The easiest way is to move such data from existing production systems, master data repositories, etc. If master data repositories do not exist or are not centrally managed, the way developers derive master data often looks like the following:
SELECT *
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY [Country]) AS [CountryId],
           [Country],
           [CountryCode]
    FROM (
        SELECT DISTINCT [Country], [CountryCode]
        FROM [dbo].[Sales]
    ) AS Country
) AS Countries
resulting in a sample data set such as:
CountryId  Country        CountryCode
1          Canada         CAN
2          Mexico         MEX
3          United States  USA
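For readers who want to try this pattern without a SQL Server instance, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory database. The `Sales` table and its `Country` and `CountryCode` columns come from the query above; the `SaleId` and `Amount` columns, the sample rows, and the function names are assumptions made for the example. Row numbering is done in Python rather than with `ROW_NUMBER()`, purely so the sketch also runs on older SQLite builds.

```python
import sqlite3


def make_sample_sales() -> sqlite3.Connection:
    """Create an in-memory Sales table with a few made-up transactional rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Sales ("
        "SaleId INTEGER PRIMARY KEY, Country TEXT, CountryCode TEXT, Amount REAL)"
    )
    conn.executemany(
        "INSERT INTO Sales (Country, CountryCode, Amount) VALUES (?, ?, ?)",
        [
            ("United States", "USA", 120.0),
            ("Canada", "CAN", 75.5),
            ("Mexico", "MEX", 40.0),
            ("United States", "USA", 310.0),
            ("Canada", "CAN", 18.25),
        ],
    )
    return conn


def build_countries(conn: sqlite3.Connection) -> list:
    """Derive a Countries master table from the distinct values found in Sales."""
    distinct_rows = conn.execute(
        "SELECT DISTINCT Country, CountryCode FROM Sales ORDER BY Country"
    ).fetchall()
    conn.execute(
        "CREATE TABLE Countries ("
        "CountryId INTEGER PRIMARY KEY, Country TEXT, CountryCode TEXT)"
    )
    # Number the rows starting at 1, mirroring ROW_NUMBER() OVER (ORDER BY Country).
    conn.executemany(
        "INSERT INTO Countries (CountryId, Country, CountryCode) VALUES (?, ?, ?)",
        [(i, country, code) for i, (country, code) in enumerate(distinct_rows, start=1)],
    )
    return conn.execute(
        "SELECT CountryId, Country, CountryCode FROM Countries ORDER BY CountryId"
    ).fetchall()
```

Calling `build_countries(make_sample_sales())` yields the same three rows as the sample data set: Canada, Mexico, and United States, numbered 1 to 3.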
TEST ENVIRONMENTS
When the first iteration of the code is complete, the developers often push the code to the continuous integration environment and/or to the testers, who test manually or in automated mode. In such environments, the number of test cases increases in order to test new features in more detail, along with code regression and some non-functional requirements such as ease of product use, security, etc. Continuous integration environments are used in teams utilizing best engineering practices, based on a method first mentioned by Grady Booch and later adopted by the extreme programming and agile community. QA or testing environments are used in either manual or automated mode and require significantly more data, so that all the use cases that occur in production and might be affected by the changed functionality are covered. This is the stage when data masking becomes essential, as often times