LinkedIn's Innovative Deep-Learning-Based CTR Modeling: The Deep, Wide, and Shallow Towers Explained

DataTrek

22 Oct 2023 17:30

Summary

TLDR: This video delves into a deep learning-based Click-Through Rate (CTR) prediction model used in LinkedIn's advertising system. The model uses a three-tower architecture: the Deep Tower for feature interaction, the Wide Tower for memorizing ad IDs and ensuring freshness, and the Shallow Tower for calibrating CTR predictions. By combining these elements, the model aims for more accurate CTR predictions, which in turn improves ROI for advertisers. Key points include handling over-prediction, the importance of freshness, and the trade-off between using pre-trained embeddings and learning embeddings directly inside the deep model. Overall, the three towers work together to make ad CTR predictions both more accurate and better calibrated.
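
To make the three-tower idea concrete, here is a minimal sketch in plain Python. It is not LinkedIn's implementation: the summary does not say how the towers' outputs are combined, so this sketch assumes each tower emits a scalar logit and the logits are summed before a sigmoid, as in a typical wide-and-deep setup. All function names, feature names, and weights (`predict_ctr`, `EXAMPLE_PARAMS`, `"ad_42"`, and so on) are hypothetical.

```python
import math


def sigmoid(z):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def deep_tower(dense_features, w_hidden, w_out):
    """Tiny MLP: one ReLU hidden layer, then a scalar logit.

    Stands in for the multi-layer network that learns feature interactions.
    """
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, dense_features)))
        for row in w_hidden
    ]
    return sum(w * h for w, h in zip(w_out, hidden))


def wide_tower(ad_id, ad_id_weights):
    """Memorized per-ad-ID weight; an unseen ad ID contributes 0 until learned."""
    return ad_id_weights.get(ad_id, 0.0)


def shallow_tower(position, position_weights, bias):
    """Single linear layer over overfitting-prone features such as ad position."""
    return position_weights.get(position, 0.0) + bias


def predict_ctr(dense_features, ad_id, position, params):
    """Sum the three tower logits (an assumption) and squash to a probability."""
    logit = (
        deep_tower(dense_features, params["w_hidden"], params["w_out"])
        + wide_tower(ad_id, params["ad_id_weights"])
        + shallow_tower(position, params["position_weights"], params["bias"])
    )
    return sigmoid(logit)


# Hypothetical hand-picked weights, for illustration only.
EXAMPLE_PARAMS = {
    "w_hidden": [[0.5, -0.2], [0.1, 0.3]],
    "w_out": [1.0, -1.0],
    "ad_id_weights": {"ad_42": 0.8},
    "position_weights": {1: 0.3, 2: -0.1},
    "bias": -3.0,
}

known_ad = predict_ctr([1.0, 2.0], "ad_42", 1, EXAMPLE_PARAMS)
new_ad = predict_ctr([1.0, 2.0], "new_ad", 1, EXAMPLE_PARAMS)
```

In this sketch a brand-new ad ID falls back to the deep and shallow towers alone, which is why frequent updates of the wide tower's ID weights matter for freshness.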

Takeaways

😀 **CTR Prediction is Crucial for Ads**: Click-through rate (CTR) is the key metric that measures what fraction of the people who see an ad go on to click it. It's essential for optimizing ad performance and maximizing ROI for advertisers.
😀 **Three-Tower Architecture**: The model architecture is composed of three towers: a deep tower for complex feature interactions, a wide tower for memorizing ad ID properties, and a shallow tower for calibration and improving prediction accuracy.
😀 **Deep Tower Focuses on Feature Interactions**: The deep tower uses a multi-layer neural network to process generalization features like user, ad, and contextual data, and learns embeddings optimized for conversion.
😀 **Wide Tower Provides Freshness**: The wide tower processes sparse ID features, memorizing specific ad ID properties and learning new ad IDs every hour to maintain freshness and relevancy in predictions.
😀 **Shallow Tower for Calibration**: The shallow tower introduces a simple linear layer to calibrate CTR predictions, ensuring that the probabilities are more representative of true conversion rates and reducing over-prediction.
😀 **Trade-off Between Embedding Approaches**: Two approaches for handling embeddings were compared: using pre-trained embeddings vs. learning embeddings directly in the deep model. The latter was found to provide better optimization for conversions.
😀 **Exposure Bias in CTR Models**: Exposure bias occurs when only a few ads are shown to users, leading to over-prediction of CTR. Calibration on the current model's exposed data helps mitigate this bias.
😀 **Calibration for Accurate CTR**: Calibration improves the accuracy of CTR predictions by adjusting the predicted probabilities, making them more aligned with actual conversion rates.
😀 **Hourly Updates for Model Freshness**: The wide tower is updated hourly, which allows the model to adapt quickly to new ads and maintain high relevance, ensuring that CTR predictions are based on up-to-date data.
😀 **Reducing Overfitting with the Shallow Tower**: Features prone to overfitting, such as position features, are passed only to the shallow tower, which helps in regularizing the model and improving prediction stability.
😀 **Final Model Conclusion**: The deep, wide, and shallow towers work together to improve CTR predictions by handling feature interactions, memorizing ad-specific data, and providing stable calibration, all of which contribute to better ad performance and higher ROI.
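
The calibration and over-prediction points above can be illustrated with a small diagnostic. This is a sketch under stated assumptions, not the method described in the video: it measures over-prediction as the ratio of predicted to observed clicks, then finds a single logit offset that brings the two into agreement. The function names (`calibration_ratio`, `fit_logit_shift`) and the toy data are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p / (1.0 - p))


def calibration_ratio(predicted, clicks):
    """Predicted clicks divided by observed clicks; above 1.0 means over-prediction."""
    observed = sum(clicks)
    if observed == 0:
        raise ValueError("need at least one observed click")
    return sum(predicted) / observed


def fit_logit_shift(predicted, clicks, iterations=60):
    """Bisection search for an offset b such that the shifted predictions
    sigmoid(logit(p) + b) sum to the observed number of clicks.

    Works because the shifted sum increases monotonically with b.
    """
    target = sum(clicks)
    lo, hi = -10.0, 10.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        total = sum(sigmoid(logit(p) + mid) for p in predicted)
        if total > target:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0


# Toy data: the model predicts 20% CTR, but only 1 of 10 impressions was clicked.
predicted = [0.2] * 10
clicks = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

ratio_before = calibration_ratio(predicted, clicks)
shift = fit_logit_shift(predicted, clicks)
calibrated = [sigmoid(logit(p) + shift) for p in predicted]
ratio_after = calibration_ratio(calibrated, clicks)
```

On this toy data the model over-predicts by a factor of two before the shift, and the fitted offset is negative, pulling every prediction down until predicted and observed clicks match. A learned linear layer, like the shallow tower described above, can absorb a correction of this kind during training instead of applying it afterwards.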