Sampling divides online traffic according to some randomization method. The term can refer to the act of dividing the traffic, or to the subset of traffic that results. Sampling is traffic division with special requirements: the division must be uniform and random, and it must be possible to filter out the requests that do not conform to a given specification. We therefore split the sampling process into two steps, traffic segmentation and traffic screening. Segmentation scatters the full traffic evenly and extracts a fixed proportion of it. Screening assists segmentation: it filters the non-conforming requests out of the traffic that has already been segmented. This article is mainly concerned with the implementation of traffic segmentation.
The most commonly used approach is single-layer traffic segmentation, in which traffic is split in one particular way. For example, we can scatter traffic by cookie, or scatter it at random. Different scattering methods give different complete segmentation objects: if we scatter by cookie, the complete object being segmented is the set of all cookies; if we scatter at random, the complete object is all of the traffic to our site.
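The difference between the two scattering methods can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the names `slot_for` and `segmentation_key`, the use of SHA-256, and the 100-slot interval are all hypothetical and not taken from any particular system.

```python
import hashlib
import random


def slot_for(key: str, slots: int = 100) -> int:
    """Map a key to a slot in [0, slots) with a stable hash (SHA-256 is an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % slots


def segmentation_key(request: dict, mode: str, rng: random.Random) -> str:
    """Choose what gets scattered: the cookie, or a fresh random token per request."""
    if mode == "cookie":
        return request["cookie"]
    if mode == "random":
        return str(rng.getrandbits(64))
    raise ValueError(f"unknown mode: {mode}")


rng = random.Random(0)
requests = [{"cookie": "user-a"}, {"cookie": "user-a"}, {"cookie": "user-b"}]

# Scattering by cookie: every request from the same cookie lands in the same slot,
# so the object being segmented is the set of cookies.
cookie_slots = [slot_for(segmentation_key(r, "cookie", rng)) for r in requests]

# Scattering at random: each request gets its own slot regardless of cookie,
# so the object being segmented is the full stream of requests.
random_slots = [slot_for(segmentation_key(r, "random", rng)) for r in requests]
```

With cookie-based scattering a given visitor sees a consistent experience across requests, which is often why it is preferred when the thing being measured is per-user behaviour.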
When a new feature or a new strategy is developed for the site, it has to be evaluated before it goes live for the full traffic. The evaluation method we use is the A/B test: two small samples are drawn from the total traffic, one is sent through the new-strategy branch and the other through the old-strategy branch. By comparing the metrics of the two samples we can evaluate the new strategy and then decide whether to roll it out to the full traffic.
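A minimal sketch of such an A/B comparison is given here. It assumes that every request has already been assigned a slot number between 0 and 9,999 by whatever segmentation scheme is in use; the slot ranges, the names `branch_for` and `compare`, and the sample data are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Assumption: two disjoint 1% samples out of 10,000 slots.
NEW_STRATEGY = range(0, 100)    # slots 0..99 take the new-strategy branch
OLD_STRATEGY = range(100, 200)  # slots 100..199 take the old-strategy branch


def branch_for(slot: int) -> str:
    """Return the experiment branch for a slot; all other traffic is outside the test."""
    if slot in NEW_STRATEGY:
        return "new"
    if slot in OLD_STRATEGY:
        return "old"
    return "default"


def compare(samples):
    """Take (slot, metric) pairs and return (new mean, old mean, difference).

    Raises statistics.StatisticsError if either branch received no samples.
    """
    by_branch = {"new": [], "old": []}
    for slot, value in samples:
        branch = branch_for(slot)
        if branch in by_branch:
            by_branch[branch].append(value)
    new_mean = mean(by_branch["new"])
    old_mean = mean(by_branch["old"])
    return new_mean, old_mean, new_mean - old_mean


# Hypothetical metric values (for example, 1.0 = clicked, 0.0 = did not click).
samples = [(5, 1.0), (50, 1.0), (150, 0.0), (199, 1.0), (5000, 0.0)]
new_mean, old_mean, diff = compare(samples)
```

Keeping the two slot ranges disjoint and of equal size is what makes the two samples comparable. Whether an observed difference is large enough to act on is a separate statistical question that this sketch does not address.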
Figure 1.1: Single-layer traffic segmentation architecture
1. Background
The background is as described above.
2. Single-layer traffic segmentation architecture
With the above in mind, how do we implement single-layer traffic segmentation? As shown in Figure 1.1, following the specified segmentation mode, the input parameter first goes through a hash calculation; the hash algorithm is what guarantees that the results are uniform and random. Obtaining the hash result does not complete the segmentation, however: the result still has to be mapped onto the complete segmentation object. To do this, we treat the complete segmentation object as an interval and map each hash result to a point on it. The size of the interval is determined by the minimum segmentation granularity. For example, if the minimum granularity is 0.01%, the interval needs at least 10,000 slots. Once the interval is defined, we take the hash result modulo a number equal to the interval's maximum value plus 1, so that every result maps to exactly one point on the interval. In this way all traffic is scattered across the interval.
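The hash-then-modulo mapping can be sketched as follows. This is an illustration rather than the implementation behind Figure 1.1: the interval 0..9,999 (one slot per 0.01%), the choice of SHA-256, and the names `slot_of` and `in_sample` are our own assumptions.

```python
import hashlib

# Assumption: slots 0..9999, so each slot represents 0.01% of the traffic.
INTERVAL_MAX = 9_999
MODULUS = INTERVAL_MAX + 1  # the modulus equals the interval maximum plus 1


def slot_of(key: str) -> int:
    """Hash the segmentation key and map the result to one slot on the interval."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % MODULUS


def in_sample(key: str, start: int, percent: float) -> bool:
    """Return True if the key falls in [start, start + width), where width is percent of all slots."""
    width = round(MODULUS * percent / 100)
    return start <= slot_of(key) < start + width


# Extract a fixed 5% of the traffic, starting at slot 0.
keys = [f"cookie-{i}" for i in range(20_000)]
sampled = [k for k in keys if in_sample(k, start=0, percent=5)]
```

Because the hash is deterministic, the same key always maps to the same slot, and because the hash output is close to uniform, a sub-interval covering 5% of the slots receives close to 5% of the keys.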