00. An Investigation of a Lost Location Issue

It will take about 5 minutes to finish reading this article.

1. Background

This is one of the issues that left the deepest impression on me. I’m recording it here because I find it fascinating.

At that time I was the lead of the iOS team for the overseas business at Meituan. The business was brand new, and I was one of the few people who were familiar with all of it. One day, our data engineer came to me and told me that the location data of our business had a loss rate as high as 30%, and even 50% in some areas. Based on past experience, a loss rate between 10% and 15% would have been normal, so this was far too high.

As we all know, without correct location information (city or region), we could not serve the right business data to the right areas. I soon realized that this was a serious problem for our business, and I reported it to our technical boss. We decided that until the problem was resolved, users whose location information was lost would only see the default H5 page we offered.

2. Investigation process

Then I began to investigate the issue and tried to optimize it. I gradually discovered that the problem was complicated: it spanned many links in the chain and required a lot of cross-team cooperation.

The process of getting location data in iOS is as follows:

(1) The app receives the longitude and latitude in the system callback defined by CLLocationManagerDelegate.

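    // Called by Core Location whenever new location data is available.
    - (void)locationManager:(CLLocationManager *)manager didUpdateLocations:(NSArray<CLLocation *> *)locations {
        // Updates arrive in chronological order, so the most recent fix is the last element.
        CLLocation *location = [locations lastObject];
        CLLocationCoordinate2D coordinate = location.coordinate;
        NSLog(@"Longitude: %f, Latitude: %f", coordinate.longitude, coordinate.latitude);
    }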

At this step, failures could occur at a rate of roughly 5%, but there was little we could do about them. Not only was this a system-level method, but more importantly, the code lived in the company’s basic library, and any modification to it would have affected the entire company’s business.

(2) After getting the latitude and longitude, we sent them along with some other information (IP address, etc.) to a back-end interface (we can call it requestPosition), which returned the corresponding location information. This is a basic public interface shared by all of the company’s businesses.

Then, by querying the interface logs and the database logs of other businesses, I found that the failure rate of this interface was very high for the overseas business but normal for the domestic business. At that point I didn’t know why. After all, every business used the same interface.

Even though I didn’t know the cause, I pushed for a change: in overseas business scenarios, a failed location request would trigger a retry mechanism. If the first attempt failed, the request would be retried up to 2 more times until it succeeded (the actual logic may have been more complex).
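The retry rule described above can be sketched roughly as follows. This is only an illustration under assumptions, not the code we actually shipped: `PositionRequest`, `PositionCompletion`, and `RequestPositionWithRetry` are hypothetical names, and the real requestPosition call is represented by a block so that the retry logic stands on its own.

```objc
#import <Foundation/Foundation.h>

// Hypothetical types: one call to the requestPosition interface,
// with its result reported through a completion block.
typedef void (^PositionCompletion)(NSDictionary *position, NSError *error);
typedef void (^PositionRequest)(PositionCompletion completion);

// Runs `request` once and, while it keeps failing, retries it
// up to `retriesLeft` more times before giving up.
static void RequestPositionWithRetry(PositionRequest request,
                                     NSUInteger retriesLeft,
                                     PositionCompletion completion) {
    request(^(NSDictionary *position, NSError *error) {
        if (position != nil || retriesLeft == 0) {
            // Success, or no retries left: hand the final result to the caller.
            completion(position, error);
            return;
        }
        // Failure with retries remaining: issue the same request again.
        RequestPositionWithRetry(request, retriesLeft - 1, completion);
    });
}
```

Calling it with `retriesLeft` set to 2 gives at most three attempts in total, which matches the rule above. A production version would probably also wait between attempts, but that detail is an assumption and is left out here.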

After this optimization, I verified the location data for the business again and found that the loss rate had improved, but only to a limited degree: it dropped to about 20% to 25%, from the original average of 30% to 40%.

(3) After that I escalated the issue, because I felt it might be beyond what we could solve on the mobile side, and I organized several meetings with several server-side teams to discuss it.

In the discussions with the server side, I found the following:
The company’s network services basically ran on a customized HTTP protocol over long connections (that is, connections kept alive by heartbeat messages). Backed by strong domestic server resources, domestic users benefited from this approach, and the end-to-end success rate of the network was very high.

However, because there were not enough server resources abroad, and network conditions in some foreign regions were poor, the end-to-end success rate of long connections there was very low, possibly even lower than that of short connections. We originally wanted to switch the overseas business’s network requests to short connections, but after discussion we felt that this was not a fundamental solution.
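To make the heartbeat keep-alive idea more concrete, here is a minimal sketch of how a client might decide whether a long connection is still alive. It is an illustration under assumptions and says nothing about how the company’s network library actually worked: `HeartbeatMonitor`, its methods, and the interval and missed-reply values are all hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: tracks heartbeat replies to judge whether
// a long connection should still be considered alive.
@interface HeartbeatMonitor : NSObject
- (instancetype)initWithInterval:(NSTimeInterval)interval
                       maxMissed:(NSUInteger)maxMissed
                       startTime:(NSTimeInterval)startTime;
- (void)recordReplyAtTime:(NSTimeInterval)time;
- (BOOL)isAliveAtTime:(NSTimeInterval)time;
@end

@implementation HeartbeatMonitor {
    NSTimeInterval _interval;   // Seconds between heartbeat messages.
    NSUInteger _maxMissed;      // Missed replies tolerated before giving up.
    NSTimeInterval _lastReply;  // Time of the most recent reply.
}

- (instancetype)initWithInterval:(NSTimeInterval)interval
                       maxMissed:(NSUInteger)maxMissed
                       startTime:(NSTimeInterval)startTime {
    self = [super init];
    if (self) {
        _interval = interval;
        _maxMissed = maxMissed;
        _lastReply = startTime;
    }
    return self;
}

- (void)recordReplyAtTime:(NSTimeInterval)time {
    _lastReply = time;
}

- (BOOL)isAliveAtTime:(NSTimeInterval)time {
    // Alive while the silence since the last reply spans
    // no more than `maxMissed` heartbeat intervals.
    return (time - _lastReply) <= _interval * _maxMissed;
}
@end
```

On a poor network, heartbeat replies are delayed or dropped more often, so a check like this declares the connection dead and forces a reconnect more frequently. That is one plausible reason a long connection can end up performing worse than short connections in such regions.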

Later, I learned that through the efforts of our big boss, dedicated server resources were deployed in Hong Kong to support the overseas business. After that, the loss rate improved further and finally reached the ideal range.

With that, the problem was solved.

3. Review and summary

In this process, I had to constantly review other teams’ code, and I communicated with the basic service team many times about how to optimize it.

I needed to apply for permissions to various databases and learned to write SQL queries to pull data. I queried server-side logs and repeatedly queried other businesses’ data for comparison.

During this period, I also needed to discuss and communicate with multiple server teams and multiple data teams.

Although this work may have been beyond the ability and responsibility of an iOS developer, and the process was full of challenges, looking back now, I benefited a lot from it and learned a great deal beyond iOS.