How to calculate the user's preference tag

1. User portrait—calculate user preference tags

The following describes how to calculate the user's preference tag.

The previous article on user portraits, "User Portrait-Tagging User Behavior", covered how to record each user's operational and business behavior with corresponding tags. This post covers how to compute weights for those detailed tags and derive the product and content categories each user prefers.

I covered the calculation of user tag weights in this article:

User portrait label weight algorithm

Here is the formula in detail:

User tag weight = behavior type weight × time decay × number of behaviors × TF-IDF tag weight

The parameters in the formula are defined as follows:

Behavior type weight: Browsing, searching, favoriting, placing an order, and purchasing signal different levels of interest. Generally, the more effort a behavior takes, the greater its weight. The values are usually assigned subjectively by operators or data analysts;

Time decay: The influence of some behaviors fades over time. The further in the past a behavior occurred, the less it says about the user's current interests;

Number of behaviors: Tag weights are computed per day. The more times a user triggers a tag on a given day, the more that tag counts for the user;

TF-IDF tag weight: The product of the tag's importance to the user (TF) and the tag's importance within the entire tag set (IDF), which gives an objective weight for each tag;
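The formula above can be sketched in Python as follows. This is only an illustration under stated assumptions: the behavior weights in ACT_WEIGHT are placeholders, the exponential form of time_decay and its rate are my assumption (the formula does not fix a decay function), and tf_idf uses a common textbook variant of TF-IDF rather than a definition given in this post.

```python
import math
from datetime import date

# Hypothetical behavior-type weights. These are assigned subjectively in
# practice, so the numbers below are placeholders only.
ACT_WEIGHT = {"browse": 0.3, "search": 0.5, "favorite": 0.8, "purchase": 1.5}


def time_decay(act_date, today, rate=0.1):
    """Assumed exponential decay: exp(-rate * days_elapsed)."""
    days = max((today - act_date).days, 0)
    return math.exp(-rate * days)


def tf_idf(user_tag_counts, user, tag):
    """TF = tag's share of the user's counts; IDF = log(users / users with tag)."""
    counts = user_tag_counts[user]
    total = sum(counts.values())
    if total == 0 or counts.get(tag, 0) == 0:
        return 0.0
    users_with_tag = sum(1 for c in user_tag_counts.values() if c.get(tag, 0) > 0)
    return (counts[tag] / total) * math.log(len(user_tag_counts) / users_with_tag)


def tag_weight(act_type, act_date, today, cnt, tfidf, rate=0.1):
    """behavior type weight x time decay x number of behaviors x TF-IDF weight."""
    return ACT_WEIGHT[act_type] * time_decay(act_date, today, rate) * cnt * tfidf


# Toy data: per-user counts of how often each tag was triggered.
counts = {"u1": {"A": 3, "B": 1}, "u2": {"B": 2}}
tfidf_a = tf_idf(counts, "u1", "A")  # 0.75 * ln(2)
w_today = tag_weight("purchase", date(2018, 5, 1), date(2018, 5, 1), 2, tfidf_a)
w_older = tag_weight("purchase", date(2018, 4, 21), date(2018, 5, 1), 2, tfidf_a)
```

With this IDF variant, a tag that every user has gets a weight of zero, which is the usual TF-IDF behavior: a tag shared by everyone does not distinguish anyone. The same purchase made ten days earlier (w_older) ends up with a smaller weight than one made today (w_today).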

To compute user preference tags, first compute the weight of each user behavior tag, then aggregate the weights of similar tags. How to assign user behavior tags is covered in this post:

User portrait—tag user behavior

The following describes how to derive user preference tags from the user behavior tag table:

1. User tag weight table structure design

Field definition:

User id (user_id): the unique id of the user;

Tag id (tag_id): book id;

Tag name (tag_name): the name of the book;

Number of user behaviors (cnt): the number of times the user generated the tag on a given day. If the user browsed a book 4 times that day, record 4;

Behavior date (date_id): the date on which the tag was generated;

Tag type (tag_type_id): In this case, the type of each book is looked up by joining with the book type table; for example, "How the Steel Was Tempered" maps to "Classics";

User behavior type (act_type_id): the kind of operation the user performed, such as purchasing, browsing, or commenting. In this example, the values 1 to 7 identify the behavior type. 1: purchase, 2: browse, 3: comment, 4: favorite, 5: cancel favorite, 6: add to shopping cart, 7: search;
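To make the field list concrete, here is a minimal Python sketch of one row of this table. The class name UserTagRow, the Python types, and the sample values are my own choices for illustration, not part of the original table definition.

```python
from dataclasses import dataclass

# Behavior type codes as defined in the field list above.
ACT_TYPES = {
    1: "purchase",
    2: "browse",
    3: "comment",
    4: "favorite",
    5: "cancel favorite",
    6: "add to shopping cart",
    7: "search",
}


@dataclass(frozen=True)
class UserTagRow:
    """One row of the user tag table (hypothetical Python mirror of the schema)."""
    user_id: str      # unique id of the user
    tag_id: str       # book id
    tag_name: str     # name of the book
    cnt: int          # times the tag was generated on date_id
    date_id: str      # behavior date, e.g. "2018-05-01"
    tag_type_id: str  # book type, looked up from the book type table
    act_type_id: int  # 1-7, see ACT_TYPES

    def __post_init__(self):
        if self.act_type_id not in ACT_TYPES:
            raise ValueError(f"unknown act_type_id: {self.act_type_id}")
        if self.cnt < 0:
            raise ValueError("cnt must be non-negative")


# A user who browsed the same book 4 times on one day is stored as one row.
row = UserTagRow("u001", "b1001", "Example Book", 4, "2018-05-01", "t01", 2)
```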

2. Building the weight table from user behavior tags

To build the tag weight table, first create a weight dimension table that maps each user behavior type to its weight:

Insert data into the dimension table:
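As a rough illustration of what such a dimension table holds and how it is used, here is a Python sketch. Every weight value below is hypothetical (they are meant to be set by operators or analysts), and the negative weight for cancelling a favorite is my assumption. The names ACT_WEIGHT_DIM and attach_weight are made up for this example.

```python
# Hypothetical weight dimension table: one row per act_type_id (1-7).
# All act_weight values are placeholders, not values from the original post.
ACT_WEIGHT_DIM = [
    {"act_type_id": 1, "act_name": "purchase", "act_weight": 1.5},
    {"act_type_id": 2, "act_name": "browse", "act_weight": 0.3},
    {"act_type_id": 3, "act_name": "comment", "act_weight": 0.5},
    {"act_type_id": 4, "act_name": "favorite", "act_weight": 0.5},
    {"act_type_id": 5, "act_name": "cancel favorite", "act_weight": -0.5},
    {"act_type_id": 6, "act_name": "add to shopping cart", "act_weight": 1.0},
    {"act_type_id": 7, "act_name": "search", "act_weight": 0.8},
]

# Index the dimension rows by key, as a join on act_type_id would.
weight_by_act = {r["act_type_id"]: r["act_weight"] for r in ACT_WEIGHT_DIM}

# Sample rows from the user behavior tag table (made-up data).
behavior_rows = [
    {"user_id": "u001", "tag_id": "b1001", "act_type_id": 2, "cnt": 4},
    {"user_id": "u001", "tag_id": "b1002", "act_type_id": 1, "cnt": 1},
]


def attach_weight(rows, weights):
    """Add act_weight to each behavior row (equivalent to an inner join)."""
    return [
        {**r, "act_weight": weights[r["act_type_id"]]}
        for r in rows
        if r["act_type_id"] in weights
    ]


weighted = attach_weight(behavior_rows, weight_by_act)
```

Keeping the weights in their own table means they can be tuned without touching the behavior data.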

3. Sum the weights of each tag for each user, sort by total weight in descending order, and take the top N
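This aggregation step can be sketched in Python as below. It assumes the per-row weights have already been computed and arrive as (user_id, tag_id, weight) tuples; the function name top_n_tags and the sample data are made up for illustration.

```python
import heapq
from collections import defaultdict


def top_n_tags(weighted_rows, n):
    """Sum weights per (user, tag), then keep each user's n heaviest tags."""
    totals = defaultdict(float)
    for user_id, tag_id, weight in weighted_rows:
        totals[(user_id, tag_id)] += weight

    per_user = defaultdict(list)
    for (user_id, tag_id), total in totals.items():
        per_user[user_id].append((tag_id, total))

    # nlargest returns the entries sorted by total weight, largest first.
    return {
        user_id: heapq.nlargest(n, tags, key=lambda t: t[1])
        for user_id, tags in per_user.items()
    }


rows = [
    ("u001", "b1", 0.6),
    ("u001", "b2", 1.5),
    ("u001", "b1", 1.2),
    ("u001", "b3", 0.2),
    ("u002", "b2", 0.9),
]
prefs = top_n_tags(rows, n=2)
top_u001 = [tag_id for tag_id, _ in prefs["u001"]]
```

To get preferred categories instead of preferred books, the same function works if tag_type_id is passed in place of tag_id.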

2. User portrait—data index and table structure design

This article introduces the data indicators that need to be developed for a user portrait and the table structure design used during development.

First, the data indicators. A typical indicator system for portrait development includes user attributes, user behavior tags, user active time periods, user spending power, user preferences, and so on.

Data index system

User attribute indicators

User attribute indicators describe the user's basic attributes as comprehensively as the business data sources allow. These values do not change in the short term. Examples include age, gender, the home region of the mobile phone number, and the home region of the ID card.

User login activity indicators

These cover the user's recent login time periods, login duration, login frequency, most frequent login location, and similar indicators.

User spending power indicator

These cover the user's spending amount, purchase frequency, and time of most recent purchase. Combined with the user's login activity, users can be segmented with the RFM (recency, frequency, monetary) model.
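As a rough sketch of RFM segmentation, the Python below marks each of the three dimensions as high (1) or low (0) against a threshold and combines them into a segment code. The thresholds (30 days, 5 orders, 500 in spend) and the function name rfm_segment are hypothetical; real cut-offs depend on the business.

```python
from datetime import date


def rfm_segment(last_purchase, order_count, total_amount, today,
                r_days=30, f_orders=5, m_amount=500.0):
    """Return a code such as 'R1F0M1'; all thresholds are illustrative."""
    recent = (today - last_purchase).days <= r_days   # Recency
    frequent = order_count >= f_orders                # Frequency
    big_spender = total_amount >= m_amount            # Monetary
    return f"R{int(recent)}F{int(frequent)}M{int(big_spender)}"


today = date(2018, 5, 1)
loyal = rfm_segment(date(2018, 4, 25), 8, 1200.0, today)
lapsed = rfm_segment(date(2018, 1, 1), 1, 50.0, today)
```

This yields eight possible segments, from R1F1M1 (recent, frequent, high spend) to R0F0M0.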

User Churn Level

Estimate how likely the user is to churn based on activity and spending, so that users who show signs of churning can be targeted with win-back marketing in time.

User age division

For marketing campaigns or on-site push notifications, operations can be targeted at different age groups.

User behavior label

Record every operation the user performs on the platform, together with the tag that the behavior produces. The user's preference tags can later be computed from these behavior tags and used for recommendations and marketing campaigns.

Table structure design

As for storing portrait data, user attributes rarely change in the short term, but most other data is updated frequently, typically weekly or daily.

Because portrait data is updated so often, the tables are usually partitioned, which groups the data physically by the partition key so that a query reads only the partitions it needs.

The partition key is usually a date field. Partitioning exists to optimize query performance; apart from that, people who use the data do not need to care whether a field is a partition column.

For example, create a user behavior label table:

CREATE TABLE userprofile (
  user_id string,
  tag_id string,
  tag_name string,
  cnt string,
  act_type_id string,
  tag_type_id string
) PARTITIONED BY (date_id string);

The partition table changes the way Hive stores data. If there is no partition, the created table directory is:

hdfs://master_server/user/hive/warehouse/userprofile

Once date partitions are created, Hive adds subdirectories that reflect the partition structure:

hdfs://master_server/user/hive/warehouse/userprofile/date_id=2018-05-01

In the userprofile table, each date partition can hold the full history up to that day, which makes the data easy for users to look up.
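The directory layout follows a simple rule: Hive appends one key=value subdirectory per partition column under the table directory. The small Python helper below reproduces that rule for illustration; partition_path is a made-up name, not a Hive API.

```python
def partition_path(warehouse_dir, table, **partition_values):
    """Build a partition directory path of the form table/key=value/..."""
    parts = [warehouse_dir.rstrip("/"), table]
    parts += [f"{key}={value}" for key, value in partition_values.items()]
    return "/".join(parts)


path = partition_path(
    "hdfs://master_server/user/hive/warehouse",
    "userprofile",
    date_id="2018-05-01",
)
```

Since each date partition holds the full history up to that day, a consumer usually only needs to read the single most recent date_id directory.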
