[OSM-dev] Overpass API delivers randomly invalid data

Gubler, Ruediger rgubler at init-ka.de
Thu Oct 27 07:20:20 BST 2011


Hi,

we tried to download a huge area with a Groovy script.
The download started at 25.10.11 15:09:37 GMT+2 and the error was detected at 26.10.11 01:50:28 GMT+2.
I can't say which particular query failed, but I have a node ID which was wrong: 679111231
The error is reproducible, but it occurs with changing elements (nodes, ways, relations) and different IDs.
Only a few elements are broken; the surrounding elements are OK.
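To locate the broken elements in a downloaded file, one can scan for node/way/relation lines that lack the version attribute, which a query with [@meta] should always deliver. A minimal sketch in Java (the class and method names are my own, not part of our script):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class MetaCheck {
    // Collects lines that open a node/way/relation element but carry no
    // version attribute -- the truncation symptom described above.
    static List<String> findTruncated(Reader source) throws IOException {
        List<String> broken = new ArrayList<>();
        BufferedReader br = new BufferedReader(source);
        String line;
        while ((line = br.readLine()) != null) {
            String t = line.trim();
            boolean opensElement = t.startsWith("<node ")
                    || t.startsWith("<way ")
                    || t.startsWith("<relation ");
            if (opensElement && !t.contains("version=")) {
                broken.add(t);
            }
        }
        return broken;
    }

    public static void main(String[] args) throws IOException {
        String sample =
            "<node id=\"31881194\" lat=\"47.5137295\" lon=\"11.3250178\"/>\n" +
            "<node id=\"1\" lat=\"47.5\" lon=\"11.3\" version=\"2\"" +
            " timestamp=\"2011-10-25T13:09:37Z\"/>\n";
        // Only the first node lacks version=, so one line is reported.
        System.out.println(MetaCheck.findTruncated(new StringReader(sample)).size());
    }
}
```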

Here is the download part of our script:

MIN_X_Property = 8.059576
MIN_Y_Property = 46.741588
MAX_X_Property = 14.670413
MAX_Y_Property = 51.430882
CHUNK_SIZE = 0.05
PrintWriter pw = new PrintWriter(raw_unsortedFile, "UTF-8")
pw.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>")
pw.println("<osm version=\"0.6\">")
processErrorStatus = createJobsAndDownloadData(MIN_X_Property, MAX_X_Property, MIN_Y_Property, MAX_Y_Property, CHUNK_SIZE, pw, Master_Logfile)
pw.println("</osm>")
pw.close()

def int createJobsAndDownloadData(BigDecimal min_x, BigDecimal max_x, BigDecimal min_y, BigDecimal max_y, BigDecimal chunk_size, PrintWriter destinationPrintWriter, File masterLogFile)
{
  // Define a closure to log to the console as well as in our logfile
  def logALine = { String logMessage ->
    println(logMessage)
    masterLogFile.append(logMessage + "\r\n")
  }
  
  def int processErrorStatus = 0;

  // let's start
  double job_min_x = min_x
  double job_max_x = min_x + chunk_size
  boolean break_x = false
  //long count_x = 0L
  while(job_max_x <= max_x + chunk_size && !break_x)
  {
    if(job_max_x > max_x)
    {
      job_max_x = max_x
      break_x = true
    }
    double job_min_y = min_y
    double job_max_y = min_y + chunk_size
    boolean break_y = false
    //long count_y = 0L
    while(job_max_y <= max_y + chunk_size && !break_y)
    {
      if(job_max_y > max_y)
      {
        job_max_y = max_y
        break_y = true
      }

      URL api_url = null
      try {
        api_url = new URL("http://www.overpass-api.de/api/xapi?*[bbox=" + job_min_x + "," + job_min_y + "," + job_max_x + "," + job_max_y + "][@meta]");
        logALine("Calling: " + api_url.toString())
        URLConnection yc = api_url.openConnection()
        BufferedReader br = new BufferedReader(new InputStreamReader(yc.getInputStream(), "UTF-8"))
        String inputLine
        while ((inputLine = br.readLine()) != null) {
          // copy only the element lines; skip each response's XML wrapper
          if (!inputLine.contains("<osm") && !inputLine.contains("</osm")
              && !inputLine.contains("<?xml") && !inputLine.contains("<bound")
              && !inputLine.contains("<note") && !inputLine.contains("<meta")) {
            destinationPrintWriter.println(inputLine)
          }
        }
        br.close()

        // wait to jolly OSM server along
        Thread.sleep(1000)
      }
      catch(e) {
        processErrorStatus = -1
        logALine("Error at URL: " + api_url + " - " + e)
      }
      if(processErrorStatus != 0)
      {
        return processErrorStatus
      }

      //count_y++
      job_min_y = job_max_y
      job_max_y = job_max_y + chunk_size
    }
    //count_x++
    job_min_x = job_max_x
    job_max_x = job_max_x + chunk_size
  }
  
  return processErrorStatus
}
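One remark on the loop above: accumulating the chunk boundaries in double (job_max_x = job_max_x + chunk_size) lets rounding drift creep into the bounding boxes. This does not explain the truncated elements, but exact boundaries are easy to get with BigDecimal. A sketch of the idea, not a drop-in replacement for the script (class and method names are invented for illustration):

```java
import java.math.BigDecimal;

public class Tiles {
    // Walks one axis of the bounding box in exact chunk-size steps;
    // BigDecimal addition cannot drift the way double accumulation can.
    static int countTiles(BigDecimal min, BigDecimal max, BigDecimal chunk) {
        int n = 0;
        BigDecimal lo = min;
        while (lo.compareTo(max) < 0) {
            BigDecimal hi = lo.add(chunk).min(max);  // clamp the last tile
            n++;  // here the tile [lo, hi] would be downloaded
            lo = hi;
        }
        return n;
    }

    public static void main(String[] args) {
        BigDecimal min = new BigDecimal("8.059576");
        BigDecimal max = new BigDecimal("14.670413");
        BigDecimal chunk = new BigDecimal("0.05");
        System.out.println(countTiles(min, max, chunk)); // prints 133
    }
}
```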


Yours Rüdiger



-----Original Message-----
From: Roland Olbricht [mailto:roland.olbricht at gmx.de] 
Sent: Wednesday, 26 October 2011 19:20
To: dev at openstreetmap.org
Cc: Gubler, Ruediger
Subject: Re: [OSM-dev] Overpass API delivers randomly invalid data

Hi,

> We are querying data for a bounding box and randomly entries are delivered
>  as 
>           <way id="30134048">
> or
>           <node id="31881194" lat="47.5137295" lon="11.3250178"/>
>  
> The entries are cut before the version attribute. If we query the entries
>  with the id all attributes are delivered. 

I've tried to reproduce this, but all changesets I have received have 
complete meta data.

So could you please give more detailed information, in particular

- At what time, as exactly as possible, did you try to download? There was 
an outage on Sunday, including a backlog for meta data until Monday 
evening, and I need to figure out whether this is related or not.

- Which query exactly failed?
- Does it fail sometimes or reproducibly?
- In a damaged response, is the meta data missing entirely or only partly?

Best regards,

Roland
