Responsibilities:
1. Own the core data requirements of the data platform, including building the offline and real-time data platforms, designing data warehouse models, and developing ETL pipelines.
2. Work with the algorithm team to develop and adapt data features.
3. Use machine learning, large models, and other AI tools to analyze data in depth and mine latent features such as user profiles, then apply those features to recommendation, search, and related services.
4. Keep the data platform stable and data quality reliable, and optimize the system to resolve its existing problems.
5. Drive continuous improvement and optimization of the big data platform, keep track of how new technologies can be applied to it, and improve the capability and efficiency of data services.
6. Define data development standards, write technical documentation, and provide technical guidance to junior engineers.
Requirements:
1. Bachelor's degree or above in computer science, mathematics, statistics, or a related field, with at least 2 years of development experience in big data.
2. Proficient with big data ecosystem tools such as Spark, Flink, and Hive, and able to carry out complex data processing.
3. Proficient in Python with pandas, familiar with the Linux development environment, and able to write automation task scripts fluently.
4. Strong data analysis and logical thinking skills.
5. Solid Java fundamentals and familiarity with backend service development.
Bonus Points:
1. E-commerce work experience is preferred.
2. Project experience in recommendation, search, or user profiling is preferred.
3. Hands-on experience with AWS cloud services such as EMR and Redshift is preferred.
4. Experience working on real-time data warehouses is preferred.