Solr Multi-Core Configuration


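In Apache Solr 3 Enterprise Search Server, the authors strongly recommend using multiple cores, and note that multicore may well become the default setup in the later 4.0 release. My project happened to need multiple cores, so I worked out a configuration that works. It turns out to be quite simple.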

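The two cores I need hold different indexes and use different solrconfig and schema files, so each core's configuration and data go into its own directory to keep them manageable.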

1. In the solr.home directory, create solr.xml (or edit it if it already exists) with the following content:
<?xml version="1.0" encoding="UTF-8" ?>
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements.  See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License.  You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<!--
 All (relative) paths are relative to the installation path
  
  persistent: Save changes made via the API to this file
  sharedLib: path to a lib directory that will be shared across all cores
-->
<solr persistent="false" sharedLib="lib">

  <!--
  adminPath: RequestHandler path to manage cores.  
    If 'null' (or absent), cores will not be manageable via request handler
  -->
  <cores adminPath="/admin/cores" shareSchema="true">
    <core name="core_bw" instanceDir="core_bw" dataDir="./data" />
    <core name="core_bw2" instanceDir="core_bw2" dataDir="./data" />
  </cores>
</solr>

A copy of this solr.xml also ships with the Solr download, under example\solr.

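2. Copy each core's data and conf folders into its instanceDir directory under solr.home (core_bw and core_bw2), then start Tomcat. The cores section of the Solr admin page should now list the two cores for selection.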

3. Existing request URLs must be updated to include the core name, for example:
Before: http://localhost:8080/solr/search1
After: http://localhost:8080/solr/core_bw/search1
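
If changing every client URL is inconvenient, the book excerpt quoted further down mentions a defaultCoreName attribute on the cores element. The fragment below is a minimal sketch of that idea applied to the configuration above. It assumes core_bw is the core that should answer requests that omit the core name. The attribute is not part of the configuration shown in step 1 and is untested here.

```xml
<solr persistent="false" sharedLib="lib">
  <!-- defaultCoreName is an assumed addition: requests that carry no core
       name, such as /solr/search1, would be served by core_bw -->
  <cores adminPath="/admin/cores" defaultCoreName="core_bw" shareSchema="true">
    <core name="core_bw" instanceDir="core_bw" dataDir="./data" />
    <core name="core_bw2" instanceDir="core_bw2" dataDir="./data" />
  </cores>
</solr>
```

Requests for core_bw2 would still need the explicit /solr/core_bw2/ prefix.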

Excerpt from the book (Chapter 8, Deployment):
Configuring solr.xml
When Solr starts up, it checks for the presence of a solr.xml file in the solr.home
directory. If one exists, then it loads up all the cores defined in solr.xml. We've used
multiple cores in the sample Solr setup shipped with this book to manage the various
indexes used in the examples.
You can see the multicore configuration at ./examples/cores/solr.xml:
<solr persistent="false" sharedLib="lib">
  <cores adminPath="/admin/cores" shareSchema="true">
    <core name="mbtracks" instanceDir="mbtype"  
      dataDir="../../cores_data/mbtracks" />
    <core name="mbartists" instanceDir="mbtype"   
      dataDir="../../cores_data/mbartists" />
    <core name="mbreleases" instanceDir="mbtype"  
      dataDir="../../cores_data/mbreleases" />
    <core name="crawler" instanceDir="crawler"  
      dataDir="../../cores_data/crawler" />
    <core name="karaoke" instanceDir="karaoke"  
      dataDir="../../cores_data/karaoke" />
  </cores>
</solr>
Notice that three of the cores: mbtracks, mbartists, and mbreleases all share the
same instanceDir of mbtype? This allows you to make configuration changes in
one location and affect all three cores.
Some of the key multicore configuration values are:
•	 persistent="false" specifies that any changes we make at runtime to the
cores, like renaming them, are not persisted. If you want to persist changes
to the cores between restarts, then set persistent="true". Note, this means
the solr.xml file is regenerated without any original comments and requires
the user running Solr to have write access to the filesystem.
•	 sharedLib="lib" specifies the path to the lib directory containing shared
JAR files for all the cores. On the other hand, if you have a core with its own
specific JAR files, then you would place them in the core/lib directory. For
example, the karaoke core uses Solr Cell (see Chapter 3, Indexing Data) for
indexing rich content, so the JARs for parsing and extracting data from rich
documents are located in ./examples/cores/karaoke/lib/.
•	 adminPath specifies the URL path at which the cores can be managed at
runtime. There's no need to change it from "/admin/cores". See below for
details on the various operations you perform to cores.
•	 shareSchema allows you to use a single in-memory representation of the
schema for all the cores that use the same instanceDir. This can cut down
on your memory use and startup time, especially in situations where you
have many cores, like if you are streaming in data and generating cores on
the fly. If you are interested in using many cores, you should keep an eye on
SOLR-1293 which is the umbrella JIRA issue for Solr that fully supports lots
of cores. I have seen Solr run with dozens of cores with no issues beyond
increased startup time.
•	 defaultCoreName, if present defines the core to use if you don't include the
core name in the URL, that is /solr/select?q=*:*. This makes it easier to
upgrade from a single core Solr to a multicore setup without changing client
URLs.
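
As an illustration of the persistent setting described in the list above, here is a sketch of the same kind of solr.xml with persistence turned on. It reuses two core names from the book's example. Everything else is an assumption made for illustration, not a file taken from the book.

```xml
<!-- persistent="true": core changes made at runtime (for example a rename)
     are written back to solr.xml. Solr regenerates the file, so comments
     like this one are lost, and the user running Solr needs write access. -->
<solr persistent="true" sharedLib="lib">
  <cores adminPath="/admin/cores" shareSchema="true">
    <!-- both cores read their configuration from mbtype/conf -->
    <core name="mbtracks" instanceDir="mbtype"
      dataDir="../../cores_data/mbtracks" />
    <core name="mbartists" instanceDir="mbtype"
      dataDir="../../cores_data/mbartists" />
  </cores>
</solr>
```

Because the two cores share an instanceDir, shareSchema="true" lets them use one in-memory schema while each keeps its own dataDir.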
Each core is configured via a fairly obvious set of properties:
•	 name specifies the name of the core, and therefore what to put in the URL to
access the core.
•	 instanceDir specifies the path to the directory that contains the conf
directory for the core, and data directory too, by default. A relative path
is relative to solr.home. In a basic single-core setup, this is typically set to
the same place as solr.home. In the preceding example we have three cores
using the same configuration directory, and two that have their own specific
configuration directories.
•	 dataDir specifies where to store the indexes and any other supporting data,
like spell check dictionaries. If you don't define it, then by default each core
stores its information in the <instanceDir>/data directory.
•	 properties="test.properties" allows you to specify a properties file
made up of name value pairs to load for the core. This is either an absolute
path or relative to the instanceDir.
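
The per-core attributes above can be combined as in the following sketch. The core names come from the book's example. Omitting dataDir on the crawler core and attaching test.properties to the karaoke core are assumptions made only to show each attribute in use.

```xml
<cores adminPath="/admin/cores">
  <!-- no dataDir given: per the description above, the index is stored
       in the data directory under instanceDir, here crawler/data -->
  <core name="crawler" instanceDir="crawler" />

  <!-- test.properties (hypothetical file) is looked up relative to
       instanceDir, here karaoke/, unless an absolute path is given -->
  <core name="karaoke" instanceDir="karaoke"
    dataDir="../../cores_data/karaoke"
    properties="test.properties" />
</cores>
```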



Reposted from blackwing.iteye.com/blog/1480065