Ektron Reference
Crawling is Microsoft Search Server's process that prepares the index files that enable searching. See Also: http://technet.microsoft.com/en-us/library/cc280343(office.12).aspx#section1
When Ektron content is added, deleted, or updated, a crawl must occur in order for that content to be available (or no longer available) to the search. In general, Ektron manages the crawl automatically -- you don't need to do anything.
This section explains the automatic crawl: what starts one, how to monitor its status, etc. In addition, if your search has a problem, you can run a manual crawl to troubleshoot it.
This section also contains these topics.
Setting the Incremental Crawl Interval
What Happens if a Crawl is Running When a New One is Scheduled to Start
Screens to Monitor and Manage Crawls
Using Crawl Filters to Improve Search Performance
Ektron supports two types of crawls.
Crawl Type | When begun | Changes that Trigger | More information |
---|---|---|---|
Full | Immediately after the triggering event | Events that significantly change data structure | Events that Start a Full Crawl |
Incremental | After incremental crawl interval passes | Less significant events that still require a crawl. Ektron batches changes and triggers an incremental crawl at specified intervals, such as every 10 minutes. Incremental crawls enhance your Search Server's performance by using fewer resources. | Events that Start an Incremental Crawl |
These events, which significantly change data structure, trigger a full crawl when they occur. This crawl registers searchable CMS properties with Search Server, and ensures that search results reflect the latest information for all CMS content.
A site is registered with Search Server. For example, a new site is installed.
Ektron registers new searchable properties with Search Server. For example, a new metadata definition, a new Smart Form configuration, etc.
NOTE: Two full crawls are run whenever new properties are mapped (for example, a Smart Form is added).
After you synchronize your sites with eSync, a crawl updates them to reflect the changed content.
- A database sync starts a full crawl.
- A content or folder sync starts an incremental crawl.
Any type of aliasing is enabled or disabled through the Workarea's Settings > Configuration > Url Aliasing > Settings page.
Except for those listed in Events that Start a Full Crawl, any event that updates the database is queued for an incremental crawl, which runs when the specified time interval passes. See Also: Setting the Incremental Crawl Interval
An incremental crawl looks for content that was added, deleted, or updated since the last crawl. Here are examples of such events.
creating content
NOTE: If a user begins to edit content then cancels, an incremental crawl is queued.
editing existing content's text, images, or properties
creating a new menu, collection, or taxonomy category
creating a new user or updating an existing one
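The rule described above (a short list of structural events starts a full crawl, and every other database update is queued for an incremental crawl) can be sketched as a simple lookup. This is only an illustration: the event labels and the `crawl_type_for` function are hypothetical names invented for this example and are not part of any Ektron API.

```python
from enum import Enum


class CrawlType(Enum):
    FULL = "full"
    INCREMENTAL = "incremental"


# Hypothetical labels for the events this section lists as starting a full crawl.
FULL_CRAWL_EVENTS = frozenset({
    "site_registered",                 # for example, a new site is installed
    "searchable_property_registered",  # new metadata definition, Smart Form configuration
    "esync_database_sync",             # eSync database sync
    "url_aliasing_toggled",            # any type of aliasing enabled or disabled
})


def crawl_type_for(event: str) -> CrawlType:
    """Return the crawl type an event leads to, per the rule in this section.

    Any event not listed as a full-crawl trigger is treated as a database
    update that is queued for the next incremental crawl.
    """
    if event in FULL_CRAWL_EVENTS:
        return CrawlType.FULL
    return CrawlType.INCREMENTAL
```

For instance, `crawl_type_for("esync_database_sync")` yields `CrawlType.FULL`, while a content or folder sync (any label not in the set) yields `CrawlType.INCREMENTAL`.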
Use the Search Configuration Screen Site Registration panel's Interval field to define the incremental crawl interval in seconds. See Also: Site Registration Panel of the Search Configuration Screen
Whenever a crawl finishes, Ektron begins to track the time. After the specified number of seconds expires, Ektron checks for changes to the database. If any occurred, a new incremental crawl starts. If none occurred, the timer is reset.
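The timer behavior just described can be summarized as a small decision function. This is a sketch of the documented logic only, not Ektron's implementation; the function name, parameter names, and return strings are assumptions made for illustration.

```python
def next_action(seconds_since_last_crawl: float,
                interval_seconds: float,
                database_changed: bool) -> str:
    """Decide what happens at a given moment after a crawl has finished.

    interval_seconds corresponds to the Interval field value, in seconds.
    """
    if seconds_since_last_crawl < interval_seconds:
        # The interval has not expired yet, so nothing is checked.
        return "wait"
    if database_changed:
        # The interval expired and the database changed since the last crawl.
        return "start_incremental_crawl"
    # The interval expired but nothing changed, so timing starts over.
    return "reset_timer"
```

With an interval of 600 seconds (10 minutes), `next_action(300, 600, True)` returns `"wait"`, `next_action(600, 600, True)` returns `"start_incremental_crawl"`, and `next_action(600, 600, False)` returns `"reset_timer"`.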
You should not need to start a crawl. Ektron initiates crawls as necessary. You would typically begin a manual crawl for troubleshooting purposes.
See Also:
Starting an Incremental Crawl Immediately
Starting a Full Crawl Immediately
To start an incremental crawl immediately, go to Workarea > Settings > Configuration > Search > Status. Click Request Incremental Crawl.
To start a full crawl immediately, go to Workarea > Settings > Configuration > Search > Status. Click Request Full Crawl.
If any type of crawl request is issued while a crawl is running, the new crawl will start only after the current crawl completes.
A pending full crawl starts before any pending incremental crawls.
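The two ordering rules above can be modeled with a small queue. The following is a minimal sketch under stated assumptions: the `CrawlQueue` class and its methods are hypothetical, and the sketch assumes a request made while no crawl is running starts at once. It does not describe how Ektron or Search Server track requests internally.

```python
from typing import List, Optional


class CrawlQueue:
    """Toy model: one crawl runs at a time; pending full crawls go first."""

    def __init__(self) -> None:
        self.running: Optional[str] = None  # "full", "incremental", or None
        self.pending: List[str] = []

    def request(self, crawl_type: str) -> None:
        """Record a crawl request; it waits if a crawl is already running."""
        if self.running is None:
            self.running = crawl_type  # assumption: an idle server starts at once
        else:
            self.pending.append(crawl_type)

    def finish_current(self) -> Optional[str]:
        """Mark the running crawl complete and start the next pending one."""
        self.running = None
        if not self.pending:
            return None
        # A pending full crawl starts before any pending incremental crawl.
        index = self.pending.index("full") if "full" in self.pending else 0
        self.running = self.pending.pop(index)
        return self.running
```

If an incremental crawl is running when another incremental request and then a full request arrive, `finish_current()` starts the full crawl first and the remaining incremental crawl after it.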
Ektron provides two screens and a log that enable you to monitor and manage crawls.
Workarea’s Search Status screen See Also: Search Status Screen
Search Configuration Screen's Crawl Management panel See Also: Ektron's Search Server Configuration Screen
The following table compares the screens.
Options | Workarea's Search Status screen | Search Configuration Screen |
---|---|---|
How to access | Ektron Workarea > Settings > Configuration > Search > Status | On server that hosts Search Server: Windows Start > All Programs > Ektron > CMS400vreleasenumber > Utilities > Search Config |
View crawl information | search server; content source name; query credentials; if there is a pending request to begin incremental search; current and next scheduled search action (See Also: Current and Next Actions); most recent start and end times; duration (last if no crawl currently running; current if crawl currently running); interval (See Also: Setting the Incremental Crawl Interval); crawl filters | On Site Registration panel: Crawl Interval (See Also: Setting the Incremental Crawl Interval); Crawl Filters; Crawl Tracing Level. On Crawl Management panel: content source name; crawl status (same as Current Action on Search Status screen); most recent start and end times; duration |
Start full crawl See Also: Starting a Full Crawl Immediately | Not available | |
Start incremental crawl See Also: Starting an Incremental Crawl Immediately | Not available | |
Set incremental crawl interval | View only | Site Registration panel > Crawl Interval field See Also: Setting the Incremental Crawl Interval |
Set the type of data that is searched | View only | Site Registration panel > Advanced Options > Crawl Filters |
Ektron stores a log of information about each crawl in the Data Directory. The Search Configuration screen's Crawl Tracing field lets you determine the amount of detail that the log collects.
The following table lists errors that may appear during a crawl, and how to resolve them. Errors appear in Microsoft Search Server 2010's administration portal. They do not appear in Ektron.
Error | Problem | Solution |
---|---|---|
Exception from HRESULT: 0xC00CEE2D | A content block based on a Smart Form has a blank content_html field. | Remove the field or insert content into it. |
The following table lists errors that may appear in the crawl log.
Error | Problem |
---|---|
<url_path_to_asset> The filtering was stopped because of a user action, such as stopping the crawl | An asset is referenced in the Ektron database, but the physical file was deleted from the file system. |
Ektron Version 8.5, Doc. Rev. 2.0 (Dec. 2011)
Visit the Ektron Dev Center at http://dev.ektron.com 1–866–4–EKTRON
Ektron Documentation, © 2011 Ektron, Inc.