
Geoprocessing Woes


#1
Martin Gamache

    Ultimate Contributor

  • Associate Admin
  • 980 posts
  • Gender:Male
  • Location:Washington DC
  • Interests:History of Topographic Cartography
    Topographic Mapping
    History of Relief Depiction
    Thematic Cartography
    Demographic Cartography
    Cartographic techniques, methods, and tools
    Orienteering
    Panoramic & Kite Photography
  • United States

In my continuing global modelling saga, I am running into several issues with both Edit Tools and ArcMap geoprocessing tools.

I am trying to union or merge, without overlaps, three global datasets of Protected Areas, and then to intersect the result with my global ecoregion data. I get nothing but unexplained errors, sometimes after 20+ hours of processing.


Does anyone have any insight into what may be causing Arc to crash and burn on basic geoprocessing tasks like these? These are big datasets. If it's a size issue, where can I find information on size limitations for these operations?

I've been reluctant to use Manifold for this process, since my past experience is that it is extremely slow for this sort of task, if it can get it done at all.
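For reference, the workflow I'm after looks roughly like this in script form (written against today's arcpy, which post-dates the ArcGIS release I'm running; the file names, working folder, and the dissolve step are only placeholders):

import arcpy

arcpy.env.workspace = r"C:\work\pa_model"   # hypothetical working folder
arcpy.env.overwriteOutput = True

# Combine the three Protected Areas datasets, then dissolve to remove
# the internal overlaps between them (file names are placeholders).
arcpy.Merge_management(["pa_set1.shp", "pa_set2.shp", "pa_set3.shp"],
                       "pa_merged.shp")
arcpy.Dissolve_management("pa_merged.shp", "pa_dissolved.shp")

# Intersect the combined Protected Areas with the global ecoregions.
arcpy.Intersect_analysis(["pa_dissolved.shp", "ecoregions.shp"],
                         "pa_by_ecoregion.shp")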

M

#2
MapMedia

    Hall of Fame

  • Validated Member
  • 1,029 posts
  • Gender:Male
  • Location:Davis, California
  • United States

Without knowing more details it's hard to help, other than to offer what I would do in a situation like yours: switch to AV 3.x. For reasons unexplained, I get faster processing times on geoprocessing tasks there than in ArcMap, especially when intersecting large, complex polygon coverages.

For mega coverages, e.g. global ones, I would tile them and run the tiles separately.

#3
CHART

    Chart

  • Validated Member
  • 358 posts
  • No Country Selected

Martin,
I really wish I could help on this one. I am a MapInfo user when it comes to geoprocessing, and I have used some very large datasets without problems.

Could there be something like a limit on the number of vertices per polygon in Arc? Maybe weed the line work on your polygons beforehand, if possible. It could also be a lat/long issue: try re-projecting into a projected dataset before running (a rough sketch follows below).
Is there a way you could break the world up into smaller parts and re-process? When I run a geoprocess on large datasets, I always test on smaller ones first (at least you don't have to wait 20 hours to find out you did something wrong).
Twenty hours to run... is there some sort of progress bar to help gauge things?
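In script terms, the re-projection idea is roughly this (written with today's arcpy; the choice of Mollweide and the file names are only placeholders):

import arcpy

# Re-project the lat/long data into a projected, equal-area coordinate
# system before running the overlays (Mollweide is just one option).
mollweide = arcpy.SpatialReference("World_Mollweide")
for fc in ["pa_merged.shp", "ecoregions.shp"]:
    arcpy.Project_management(fc, fc.replace(".shp", "_moll.shp"), mollweide)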

Not much help I know.... but I just had to try.
Chart

#4
Charlie Frye

    Master Contributor

  • Validated Member
  • 112 posts
  • Gender:Male
  • Location:Redlands, CA
  • Interests:Base map design/data model, political/election maps; use of historical maps for modern GIS analysis
  • United States

M,

Data in the real world doesn't always meet the unwritten expectations of the geoprocessing programmer. That said, your expectations are correct: it should run, and run faster than you describe. I run into things like this pretty frequently, and here's my list of things to try:

1. Repair Geometry tool. The tool will repair:
  • Null geometry: the feature will be deleted from the feature class.
  • Short segments: the geometry's short segments will be deleted.
  • Incorrect ring ordering: the geometry will be updated to have correct ring ordering.
  • Incorrect segment orientation: the geometry will be updated to have correct segment orientation.
  • Self-intersections: segments that intersect will be split at their intersection.
  • Unclosed rings: the unclosed rings will be closed.
  • Empty parts: parts that are null or empty will be deleted.
2. Particularly if the data came in as shapefiles, multipart shapes may cause problems, so use the Multipart to Singlepart tool.

3. There may still be a bad shape somewhere, so try processing a subset of the data just to verify that the model can run to completion (a rough sketch of steps 1-3 follows this list).

4. That said, you may be running into a bug. Most bugs in this class of problem can be reduced to a small scenario with just a few features. If you think that's your case, definitely submit it to tech support; they may have other things to try that I haven't thought of.

5. And for the record, we've not had a vertex limit for a very long time (10+ years).
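In today's arcpy terms, steps 1-3 look roughly like this (the paths, file names, and FID cutoff are placeholders, not a prescribed workflow):

import arcpy

arcpy.env.workspace = r"C:\work\pa_model"   # placeholder path

# 1. Repair null geometries, bad ring ordering, self-intersections, etc.
arcpy.RepairGeometry_management("pa_merged.shp")

# 2. Explode multipart shapes that came in with the shapefiles.
arcpy.MultipartToSinglepart_management("pa_merged.shp", "pa_single.shp")

# 3. Verify the model runs to completion on a small subset first.
arcpy.Select_analysis("pa_single.shp", "pa_subset.shp", '"FID" < 1000')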

Charlie
Charlie Frye
Chief Cartographer
Software Products Department
ESRI, Redlands, California

#5
merft

    Key Contributor

  • Validated Member
  • 86 posts
  • Location:Colorado
  • United States

I have done some pretty intensive models. These are a few things that I have learned.
  • Make sure all your data is in the same projection.
  • Make sure all your data uses simple geometries (run Multipart to Singlepart).
  • Write your output files to a different directory from your source data, preferably on a different drive; the biggest culprit of geoprocessing errors is file/schema locks. If you try again, be sure to delete ALL files in that directory and reboot (a small environment-settings sketch follows this list).
  • Though ESRI does not have a vertex limit, the OS has file size limits, which affect both output and intermediate files.
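One way to act on the workspace advice above, sketched with today's arcpy environment settings (the drive letters and folders are hypothetical):

import arcpy

# Point the default and scratch workspaces at a drive separate from the
# source data, so outputs and intermediates never collide with the inputs.
arcpy.env.workspace = r"E:\outputs"
arcpy.env.scratchWorkspace = r"E:\scratch"
arcpy.env.overwriteOutput = True   # reduces stale-output and lock trouble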
My experience has been that the leading culprit in Model Builder issues is file/schema locks, especially with large files. File size issues come in a distant second. On a project that covered 80,000 mi², we had over 3 million polygons and were able to process it by splitting the area into four quadrants with an overlap, then merging the completed files back into one contiguous dataset (a rough sketch follows). Also, if you are using Model Builder, run small sections at a time rather than running the entire model at once.
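A rough sketch of that tile-and-merge pattern (it assumes quadrant clip polygons quad_1.shp through quad_4.shp already exist, with a slight overlap at their shared edges; all other names are placeholders):

import arcpy

pieces = []
for i in range(1, 5):
    clipped = f"pa_quad_{i}.shp"
    result = f"eco_pa_quad_{i}.shp"
    # Clip to one overlapping quadrant, then run the heavy overlay on it.
    arcpy.Clip_analysis("pa_single.shp", f"quad_{i}.shp", clipped)
    arcpy.Intersect_analysis([clipped, "ecoregions.shp"], result)
    pieces.append(result)

# Merge the per-quadrant results back into one contiguous dataset.
arcpy.Merge_management(pieces, "eco_pa_all.shp")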

I would also recommend trying a backwards approach to working with your datasets; I use this when working with large data. Use Select By Location to select the ecoregion features that intersect the Protected Areas, then invert the selection and remove the ecoregions that do not contain Protected Areas. This reduces the number of features that need to be processed, which sometimes helps (sketched below).
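Roughly, in today's arcpy (the layer and file names are placeholders):

import arcpy

# Select only the ecoregions that intersect the Protected Areas...
arcpy.MakeFeatureLayer_management("ecoregions.shp", "eco_lyr")
arcpy.SelectLayerByLocation_management("eco_lyr", "INTERSECT", "pa_dissolved.shp")

# ...and copy them to a smaller working file. Dropping the rest has the same
# effect as inverting the selection and deleting the non-intersecting features.
arcpy.CopyFeatures_management("eco_lyr", "ecoregions_with_pa.shp")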

Sometimes there is a problem with the shapefiles themselves. I would recommend "washing" the files through a third-party application: convert the shapefile to another format, then back to shapefile. I tend to use Global Mapper for this.
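If Global Mapper isn't handy, a similar "wash" can be sketched with arcpy alone by round-tripping through a file geodatabase (the paths are made up, and the output folder is assumed to exist):

import arcpy

# Round-trip a suspect shapefile through a file geodatabase and back.
gdb = arcpy.CreateFileGDB_management(r"C:\work", "wash.gdb").getOutput(0)
arcpy.FeatureClassToFeatureClass_conversion("pa_merged.shp", gdb, "pa_washed")
# The output folder below is assumed to exist already.
arcpy.FeatureClassToShapefile_conversion([gdb + r"\pa_washed"], r"C:\work\clean")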

Good Luck,
-Tom

#6
Martin Gamache

    Ultimate Contributor

  • Associate Admin
  • 980 posts
  • Gender:Male
  • Location:Washington DC
  • United States

I've managed to do the tasks I was trying to do.

The first thing I did was delete all the protected areas files and recreate them from the original zipped datasets downloaded from the source.

I did a clean restart and made sure all the memory was free, i.e. I deleted the temp files.

This allowed me to do all the merges using ET. Prior to these steps it crashed every time.

Once I did the merges, which took almost 15 hours, I was able to run the intersect and dissolve operations, rebooting each time. Most of these operations did not succeed on the first try, returning either out-of-memory or topology errors.

m

#7
frax

frax

    Hall of Fame

  • Associate Admin
  • PipPipPipPipPipPipPip
  • 2,295 posts
  • Gender:Male
  • Location:Stockholm, Sweden
  • Interests:music, hiking, friends, nature, photography, traveling. and maps!
  • Sweden

This is part of the reason why I prefer to do things in ArcInfo Workstation a lot of the time...
* error messages make more sense and are easier to understand
* you don't have to struggle with the UI
* no unexpected oddities with data and temp files...
Hugo Ahlenius
Nordpil - custom maps and GIS
http://nordpil.com/
Twitter

#8
mdsumner

    Key Contributor

  • Validated Member
  • 96 posts
  • Australia

Perhaps you might try the new 7.x Manifold:

http://69.17.46.171/...994541106730000

There is some discussion of the improvement in performance there. :lol:

#9
Martin Gamache

    Ultimate Contributor

  • Associate Admin
  • 980 posts
  • Gender:Male
  • Location:Washington DC
  • United States

My version of 7 (I'm not sure exactly which) gave me an out-of-memory error after a couple of hours of processing!

m

#10
mdsumner

    Key Contributor

  • Validated Member
  • 96 posts
  • Australia

7.1.4.575 is the version being discussed. It was released 3 days ago.

If you've already decided it's not for you, that's fine, but others might be interested. It sounds like your problem would be a great test, though: you should try it out and send them the data if it "doesn't work".

Here's a summary of the improvements in this release (from the link provided):

"We have improved the performance of the Clip Intersect and Clip Subtract operations for lines, the performance of the Split operation, and the performance of the Topology Overlays tool. The increase in the performance of the Topology Overlays tool is biggest for Union and Update overlays (a factor of 10 or more) and smallest for the Identity and Intersect overlays (a factor of 2 or more). The increase in the performance of other operations is big (a factor of 10 or more). Either way you cut it (pun intended...) the increase in performance is visible and substantial and has made the investment into improved geometry processing very worthwhile.

Plans are to close out any remaining issues not involving the geometry layer, to do full quality assurance and to issue an update as a production release shortly after the 25th of this month. So now is the time for all intrepid experimenters to hammer away at this new release with the toughest geometry processing tasks available. :-)
"

#11
Charlie Frye

    Master Contributor

  • Validated Member
  • 112 posts
  • Gender:Male
  • Location:Redlands, CA
  • United States

Martin,

I forgot one other ArcGIS/geoprocessing tidbit. In 9.1, in the Samples toolbox, there are some geoprocessing tools that were designed for handling large data efficiently. At 9.2 we added that logic into the regular versions of these tools, so it is used when it's needed. Basically, if you're geoprocessing a relatively small number of features, one memory management approach is ideal, but that strategy does not hold up for large datasets; in fact, there is a break point beyond which large data gets much slower. Another way of managing memory allows large datasets to be processed optimally, though it is not as efficient for small datasets.

At 9.2 (now shipping) we also introduce a new, additional DBMS for the geodatabase, called the File Geodatabase. This gets us past the limitations of the personal geodatabase, which was dependent on the Microsoft JET database engine: no more 2 GB file size limits, no more update transaction count limits, and no more being somewhat limited to WINTEL platforms. The personal geodatabase file locking issues will go away, but it is not a multi-user DBMS, so expect it to behave similarly to shapefiles with respect to locking (a small loading sketch follows).
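For reference, loading shapefiles into such a file geodatabase takes only a couple of lines in today's arcpy (the folder and dataset names below are just examples):

import arcpy

# Create a file geodatabase and load the shapefiles into it, sidestepping
# the 2 GB ceiling of shapefiles and the personal geodatabase.
arcpy.CreateFileGDB_management(r"C:\work", "global_model.gdb")
arcpy.FeatureClassToGeodatabase_conversion(
    ["pa_merged.shp", "ecoregions.shp"],
    r"C:\work\global_model.gdb")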

Happy Thanksgiving,

Charlie
Charlie Frye
Chief Cartographer
Software Products Department
ESRI, Redlands, California



