Facebook and Kaggle are launching a machine learning engineering competition for 2016.
Trailblaze your way to the top of the leaderboard to earn an opportunity to interview for one of the 10+ open software engineer roles, working on world-class machine learning problems.
The goal of this competition is to predict which place a person would like to check in to.
For the purposes of this competition, Facebook created an artificial world consisting of more than 100,000 places located in a 10 km by 10 km square.
For a given set of coordinates, your task is to return a ranked list of the most likely places.
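To make "a ranked list of the most likely places" concrete, here is a minimal sketch, not the competition's reference method. It assumes a simple nearest-neighbour vote: take the k training check-ins closest to the query coordinates and rank place ids by how often they appear. The function name `rank_places`, the parameters `k` and `top_n`, and the sample check-ins are all hypothetical.

```python
from collections import Counter
from math import hypot


def rank_places(query_xy, train_events, k=5, top_n=3):
    """Rank place ids by frequency among the k nearest training events.

    ASSUMPTION: plain Euclidean distance on (x, y); accuracy and time are ignored.
    """
    qx, qy = query_xy
    # Sort all training events by distance to the query and keep the k closest.
    nearest = sorted(train_events, key=lambda e: hypot(e[0] - qx, e[1] - qy))[:k]
    # Count how often each place id occurs among those neighbours.
    counts = Counter(place_id for _, _, place_id in nearest)
    # Most frequent place first; at most top_n entries are returned.
    return [place_id for place_id, _ in counts.most_common(top_n)]


# Hypothetical (x, y, place_id) check-ins, made up for illustration.
train_events = [
    (1.0, 1.0, 101),
    (1.1, 0.9, 101),
    (1.2, 1.1, 202),
    (5.0, 5.0, 303),
    (0.9, 1.0, 101),
    (1.3, 1.2, 202),
]
ranked = rank_places((1.0, 1.0), train_events)
```

Sorting every training event per query is only workable on toy data; with more than 100,000 places a spatial index or a grid partition would be the more realistic choice.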
Data was fabricated to resemble location signals coming from mobile devices, giving you a flavor of what it takes to work with real data complicated by inaccurate and noisy values.
Inconsistent and erroneous location data can disrupt the experience for services like Facebook Check In.
We highly encourage competitors to be active on Kaggle Scripts.
Your work there will be thoughtfully included in the decision making process.
Please note: You must compete as an individual in recruiting competitions.
You may only use the data provided to make your predictions.
In this competition, you are going to predict which business a user is checking into based on their location, accuracy, and timestamp.
The train and test datasets are split by time, and the test data is split randomly between the public and private leaderboards.
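Since the official split is by time, a local validation set that mimics it can be carved off the end of the training data. The following is a minimal sketch under that assumption; `time_split` and `train_fraction` are hypothetical names, and the events are made-up dicts with only a `time` field.

```python
def time_split(events, train_fraction=0.8):
    """Split events chronologically: earliest part for training, rest for validation.

    ASSUMPTION: larger `time` values mean later events.
    """
    ordered = sorted(events, key=lambda e: e["time"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical events with made-up timestamps.
events = [{"time": t} for t in (5, 1, 4, 2, 3, 9, 8, 7, 6, 10)]
train_part, valid_part = time_split(events)
```

A random split would leak future check-ins into training and tends to give an optimistic score compared with a chronological hold-out.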
There is no concept of a person in this dataset.
All row_ids are events, not people.
Note: Some of the columns, such as time and accuracy, are intentionally left vague in their definitions.
Please consider them as part of the challenge.
File descriptions

train.csv, test.csv
- row_id: id of the check-in event
- x, y: coordinates
- accuracy: location accuracy
- time: timestamp
- place_id: id of the business; this is the target you are predicting

sample_submission.csv: a sample submission file in the correct format with random predictions
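The column list above can be turned into a small loader. This is a sketch only: it assumes the CSV header uses exactly the column names listed, that `accuracy` and `time` are integers, and that `place_id` is missing or empty in test.csv because it is the target. `read_events` is a hypothetical helper and the two sample rows are made up.

```python
import csv
import io


def read_events(handle):
    """Parse check-in rows from an open CSV file into dicts with numeric fields."""
    events = []
    for row in csv.DictReader(handle):
        event = {
            "row_id": int(row["row_id"]),
            "x": float(row["x"]),
            "y": float(row["y"]),
            # ASSUMPTION: accuracy and time are integer-valued.
            "accuracy": int(row["accuracy"]),
            "time": int(row["time"]),
        }
        # ASSUMPTION: place_id is absent or empty in test.csv (it is the target).
        if row.get("place_id"):
            event["place_id"] = int(row["place_id"])
        events.append(event)
    return events


# Made-up rows standing in for the first lines of train.csv.
sample = io.StringIO(
    "row_id,x,y,accuracy,time,place_id\n"
    "0,1.25,8.50,60,1000,111\n"
    "1,6.75,3.10,15,2500,222\n"
)
events = read_events(sample)
```

For the real file, `read_events(open("train.csv", newline=""))` would be the equivalent call, assuming the file sits in the working directory.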
Dataset download

Analysis
Feature values: x and y coordinates, location accuracy, timestamp.
Target value: id of the check-in place.
Processing:
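One way such processing could look is sketched below. It is an illustration, not the steps of this write-up: it assumes `time` counts minutes from an arbitrary origin (the competition deliberately leaves the unit undefined) and that places with very few check-ins are dropped before training. The helpers `add_time_features` and `drop_rare_places` and the threshold `min_checkins` are hypothetical.

```python
from collections import Counter

MINUTES_PER_DAY = 24 * 60


def add_time_features(event):
    """Return a copy of the event with hour-of-day and day-of-week added.

    ASSUMPTION: `time` is measured in minutes; the unit is not documented.
    """
    minutes = event["time"]
    features = dict(event)
    features["hour"] = (minutes // 60) % 24
    features["weekday"] = (minutes // MINUTES_PER_DAY) % 7
    return features


def drop_rare_places(events, min_checkins=3):
    """Keep only events whose place_id occurs at least min_checkins times."""
    counts = Counter(e["place_id"] for e in events)
    return [e for e in events if counts[e["place_id"]] >= min_checkins]


# Made-up events for illustration.
events = [
    {"time": 1500, "place_id": 1},
    {"time": 1600, "place_id": 1},
    {"time": 3000, "place_id": 1},
    {"time": 4000, "place_id": 2},
]
kept = [add_time_features(e) for e in drop_rare_places(events)]
```

Cyclic features such as hour and weekday let a model pick up daily and weekly check-in patterns that a raw, ever-increasing timestamp hides.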