The main goal of this project is to predict whether an ad will be clicked, based on the given ad information and user information.
If an ad has a high predicted probability of being clicked, show it; if the probability is low, don't show it.
An ad that is not clicked benefits neither side (the advertiser or the app), so estimating that probability accurately is the goal of this project.
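The show-or-hide rule described here can be sketched in a few lines. This is only an illustration: the function name `select_ads_to_show`, the ad identifiers, and the 0.5 threshold are all hypothetical, and in practice the threshold would be tuned to the application.

```python
from typing import Dict, List


def select_ads_to_show(click_probabilities: Dict[str, float],
                       threshold: float = 0.5) -> List[str]:
    """Return the ads whose predicted click probability reaches the threshold.

    `threshold` is an assumed cut-off, not a value taken from the project.
    """
    return [ad_id for ad_id, p in click_probabilities.items() if p >= threshold]


# Hypothetical predicted probabilities for three candidate ads.
predicted = {"ad_a": 0.8, "ad_b": 0.1, "ad_c": 0.5}
shown = select_ads_to_show(predicted, threshold=0.5)
print(shown)  # ['ad_a', 'ad_c']
```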
In this project, I will complete the following tasks:
1. Read and understand the data: read the given CSV file into memory, then compute summary statistics and visualizations with pandas to get a deeper understanding of the data.
2. Feature construction: derive new features from the original ones, which is an important step in machine learning work.
3. Feature transformation: features are generally either continuous or categorical, and each type needs different treatment.
4. Feature selection: select appropriate features from the existing ones, which is also an essential part of many projects.
5. Model training and evaluation: train and evaluate the model using cross-validation, together with grid search and related techniques.
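
The five tasks above can be sketched end to end as follows. This is a minimal sketch under several assumptions, not the project's actual code: the data is a small synthetic table standing in for the CSV file, the column names (`hour`, `banner_pos`, `device_type`, `click`) and the derived feature `is_night` are hypothetical, and scikit-learn with logistic regression and chi-squared selection is only one possible choice of tools.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# 1. Read and understand the data.
# Synthetic stand-in for the CSV; the real project would call pd.read_csv here.
rng = np.random.default_rng(0)
n_rows = 400
df = pd.DataFrame({
    "hour": rng.integers(0, 24, n_rows),          # hypothetical continuous feature
    "banner_pos": rng.integers(0, 3, n_rows),     # hypothetical categorical feature
    "device_type": rng.choice(["phone", "tablet", "pc"], n_rows),
})
# Make clicks depend on banner_pos so there is a signal to learn.
click_rate = 0.15 + 0.25 * (df["banner_pos"] == 0)
df["click"] = (rng.random(n_rows) < click_rate).astype(int)
# A simple statistic: observed click rate per banner position.
rate_by_pos = df.groupby("banner_pos")["click"].mean()

# 2. Feature construction: derive a new feature from an original one.
df["is_night"] = ((df["hour"] >= 22) | (df["hour"] < 6)).astype(int)

X = df.drop(columns="click")
y = df["click"]

# 3. Feature transformation: scale continuous columns, one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", MinMaxScaler(), ["hour"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"),
     ["banner_pos", "device_type", "is_night"]),
])

# 4. Feature selection: keep the k columns with the highest chi-squared score.
# 5. Model training: logistic regression, tuned by grid search with cross-validation.
pipeline = Pipeline([
    ("prep", preprocess),
    ("select", SelectKBest(chi2)),
    ("clf", LogisticRegression(max_iter=1000)),
])
param_grid = {"select__k": [3, 5], "clf__C": [0.1, 1.0]}
grid = GridSearchCV(pipeline, param_grid, cv=3, scoring="neg_log_loss")
grid.fit(X, y)

# Predicted click probability for each row (column 1 is the "clicked" class).
click_proba = grid.predict_proba(X)[:, 1]
print(grid.best_params_)
```

Wrapping the transformation and selection steps in a `Pipeline` means they are refit inside each cross-validation fold, so the held-out fold does not influence the scaling, encoding, or selected features.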