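Solr's ReplicationHandler exposes a set of HTTP commands (selected with the command parameter). The supported values are:
1) indexversion: the slave gets the latest index commit point information from the master. http://master_host:port/solr/replication?command=indexversion
2) filecontent: the slave downloads the content of a specified file from the master.
3) filelist: the slave gets from the master the list of index files for a given indexversion (plus any configuration files that need to be replicated). http://host:port/solr/replication?command=filelist&indexversion=<index-version-number>
4) backup: backs up the index. If you are worried that the index might get corrupted, you can back it up periodically. http://master_host:port/solr/replication?command=backup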

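The location parameter specifies the backup path.

I tried &location=bk and went looking for the generated backup; it had been created under %tomcat-home%/bin, presumably because a relative path is resolved against the directory Tomcat was started from.

Reading the code showed that an absolute path can be given instead, for example &location=/diska/solrSearch/solr_vid_album/data/bk

The backup then appears under /diska/solrSearch/solr_vid_album/data/bk as a snapshot directory named snapshot.20110129015833.

Looking at the code again, I found a parameter that sets how many backup snapshots are kept: numberToKeep. With it, the indexing machine can back up the index on a schedule and have old snapshots cleaned up each time.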

  private void doSnapShoot(SolrParams params, SolrQueryResponse rsp, SolrQueryRequest req) {
    try {
      int numberToKeep = params.getInt(NUMBER_BACKUPS_TO_KEEP, Integer.MAX_VALUE);
      IndexDeletionPolicyWrapper delPolicy = core.getDeletionPolicy();
      IndexCommit indexCommit = delPolicy.getLatestCommit();
      
      if(indexCommit == null) {
        indexCommit = req.getSearcher().getReader().getIndexCommit();
      }
      
      // small race here before the commit point is saved
      new SnapShooter(core, params.get("location")).createSnapAsync(indexCommit, numberToKeep, this);
      
    } catch (Exception e) {
      LOG.warn("Exception during creating a snapshot", e);
      rsp.add("exception", e);
    }
  }
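To show how the backup parameters fit together, here is a minimal sketch that builds a backup request URL with location and numberToKeep. The class and method names (BackupUrl, build) and the port 8080 are made up for this example; only the command, location and numberToKeep parameter names come from the notes above. The path is URL-encoded, which a browser would otherwise do for you.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class BackupUrl {

    /**
     * Builds a backup request URL.
     * handlerUrl is the replication handler, e.g. http://master_host:port/solr/replication.
     * location and numberToKeep are optional; pass null to leave them out.
     */
    public static String build(String handlerUrl, String location, Integer numberToKeep) {
        StringBuilder sb = new StringBuilder(handlerUrl).append("?command=backup");
        if (location != null) {
            sb.append("&location=").append(encode(location));
        }
        if (numberToKeep != null) {
            sb.append("&numberToKeep=").append(numberToKeep);
        }
        return sb.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be supported by every JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical host and port; the path is the one used in the notes above.
        System.out.println(build("http://master_host:8080/solr/replication",
                "/diska/solrSearch/solr_vid_album/data/bk", 5));
    }
}
```

An absolute location is used here on purpose, since a relative one ended up in Tomcat's bin directory in my test.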

Here is the code that deletes old snapshots as well:

  private void deleteOldBackups(int numberToKeep) {
    File[] files = new File(snapDir).listFiles();
    List<OldBackupDirectory> dirs = new ArrayList<OldBackupDirectory>();
    for(File f : files) {
      OldBackupDirectory obd = new OldBackupDirectory(f);
      // dir stays null unless f is a directory named snapshot.<timestamp>
      if(obd.dir != null) {
        dirs.add(obd);
      }
    }
    // sorts newest first, see OldBackupDirectory.compareTo
    Collections.sort(dirs);
    int i=1;
    for(OldBackupDirectory dir : dirs) {
      // keeps the newest numberToKeep-1 existing snapshots and deletes the rest
      if( i++ > numberToKeep-1 ) {
        SnapPuller.delTree(dir.dir);
      }
    }
  }

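The key piece is a snapshot class,

OldBackupDirectory,

which holds the relevant information and implements Comparable so that snapshots can be sorted and deleted. The comparison is by timestamp, newest first.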

  private class OldBackupDirectory implements Comparable<OldBackupDirectory>{
    File dir;
    Date timestamp;
    final Pattern dirNamePattern = Pattern.compile("^snapshot[.](.*)$");
    
    OldBackupDirectory(File dir) {
      if(dir.isDirectory()) {
        Matcher m = dirNamePattern.matcher(dir.getName());
        if(m.find()) {
          try {
            this.dir = dir;
            this.timestamp = new SimpleDateFormat(DATE_FMT).parse(m.group(1));
          } catch(Exception e) {
            this.dir = null;
            this.timestamp = null;
          }
        }
      }
    }
    public int compareTo(OldBackupDirectory that) {
      return that.timestamp.compareTo(this.timestamp);
    }
  }
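The retention idea in these two snippets can be tried out without Solr. The following is only a sketch on plain directory names, not Solr's code: SnapshotRetention and toDelete are names invented here, the timestamp format yyyyMMddHHmmss is inferred from the example directory snapshot.20110129015833, and the sketch keeps exactly the newest keep snapshots rather than numberToKeep-1.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SnapshotRetention {
    private static final String PREFIX = "snapshot.";
    // Assumed format, inferred from the example name snapshot.20110129015833
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    /** Returns the snapshot names that fall outside the newest `keep` snapshots, newest first. */
    public static List<String> toDelete(List<String> names, int keep) {
        List<String> snapshots = new ArrayList<>();
        for (String name : names) {
            if (parse(name) != null) {   // ignore anything that is not snapshot.<timestamp>
                snapshots.add(name);
            }
        }
        // newest first, like OldBackupDirectory.compareTo
        snapshots.sort((a, b) -> parse(b).compareTo(parse(a)));
        int from = Math.min(Math.max(keep, 0), snapshots.size());
        return new ArrayList<>(snapshots.subList(from, snapshots.size()));
    }

    /** Parses the timestamp out of a snapshot name, or returns null if the name does not match. */
    private static LocalDateTime parse(String name) {
        if (!name.startsWith(PREFIX)) {
            return null;
        }
        try {
            return LocalDateTime.parse(name.substring(PREFIX.length()), FMT);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "snapshot.20110129015833", "snapshot.20110130015833",
                "snapshot.20110128015833", "index");
        System.out.println(toDelete(names, 2));
    }
}
```

Working on names instead of File objects keeps the sketch testable; a real cleanup would list the backup directory and delete the returned entries.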

5) fetchindex: triggers replication manually; the effect is the same as the slave's automatic replication. http://slave_host:port/solr/replication?command=fetchindex

The master's URL can be given as well: &masterUrl=http://172.16.200.1:8080/solr/replication

6) disablepoll: stops the slave from polling for replication. http://slave_host:port/solr/replication?command=disablepoll
7) enablepoll: turns the slave's polling back on. http://slave_host:port/solr/replication?command=enablepoll
8) abortfetch: aborts a file download that is in progress on the slave. http://slave_host:port/solr/replication?command=abortfetch
9) commits: shows the IndexCommit points that are currently still retained.
10) details: shows the slave's current replication details. http://slave_host:port/solr/replication?command=details
11) enablereplication: enables replication from the master to all slaves. http://master_host:port/solr/replication?command=enablereplication
12) disablereplication: disables replication from the master to all slaves. http://master_host:port/solr/replication?command=disablereplication

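Suppose the master is: http://172.16.200.1:8080/solr/

and the slave is: http://172.16.200.2:8080/solr/

To replicate manually, send this request to the slave:

http://172.16.200.2:8080/solr/replication?command=fetchindex&masterUrl=http://172.16.200.1:8080/solr/replication

This can be put in a script and run periodically from the system's task scheduler (cron, for example).

I tested this with a 1 GB index: replication was fast, so there is little reason to worry about the service being interrupted.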
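As one possible shape for such a script, here is a small Java program that sends a single fetchindex request and could be started from the scheduler. It is a sketch: the class name FetchIndexOnce, the method names and the timeout values are my own choices, and the masterUrl value is URL-encoded before being appended.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class FetchIndexOnce {

    /** Builds the fetchindex URL; both arguments are replication handler URLs. */
    public static String fetchIndexUrl(String slaveHandler, String masterHandler) {
        try {
            return slaveHandler + "?command=fetchindex&masterUrl="
                    + URLEncoder.encode(masterHandler, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Sends a GET request and returns the HTTP status code. */
    public static int trigger(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);   // arbitrary timeouts chosen for the sketch
        conn.setReadTimeout(30000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: FetchIndexOnce <slave handler URL> <master handler URL>");
            return;
        }
        String url = fetchIndexUrl(args[0], args[1]);
        try {
            System.out.println("fetchindex -> HTTP " + trigger(url));
        } catch (IOException e) {
            System.err.println("fetchindex failed: " + e.getMessage());
        }
    }
}
```

With the addresses above it would be run as: java FetchIndexOnce http://172.16.200.2:8080/solr/replication http://172.16.200.1:8080/solr/replication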

The following is quoted from http://www.kafka0102.com/2010/07/249.html

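1. What the master does

For ReplicationHandler's replication feature, the core problem is deciding which files must be copied at a given point in time, and this is where Lucene's IndexDeletionPolicy comes in. When Lucene initializes, it calls IndexDeletionPolicy.onInit(List<? extends IndexCommit> commits); on each commit (which can also be triggered by optimize or close; a Solr commit actually closes the IndexWriter) it calls IndexDeletionPolicy.onCommit(List<? extends IndexCommit> commits). An IndexCommit object holds, among other things, the list of files associated with that commit. This lets a slave fetch the file list from the master, compare it with its local files, skip unchanged files, download new ones, and delete the ones that are no longer needed.

The two commit callbacks of IndexDeletionPolicy operate on the list of currently existing commits. Lucene's default KeepOnlyLastCommitDeletionPolicy, for example, keeps only the latest IndexCommit and calls delete on the outdated ones so that their unused files are removed. In Solr, SolrDeletionPolicy also keeps only the latest IndexCommit by default, but maxCommitAge, maxCommitsToKeep and maxOptimizedCommitsToKeep can be set to retain more. The IndexDeletionPolicy implementation Solr actually uses, however, is IndexDeletionPolicyWrapper, a wrapper around SolrDeletionPolicy.

While a slave is copying files from the master, the IndexCommit point being copied must not be deleted. This is what IndexDeletionPolicyWrapper's void setReserveDuration(Long indexVersion, long reserveTime) method is for: the master calls it before responding to a slave's indexversion and filelist commands, and again each time it has sent 5 MB of index file content to the slave. The default reserveTime is 10 s, so if a slow network needs more than 10 seconds to transfer 5 MB, this value has to be increased.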
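The reservation mechanism can be pictured as a map from index version to an expiry time. The class below is a simplified model written for this note, not the code of IndexDeletionPolicyWrapper: the name CommitReservations, the isReserved method and the explicit nowMs argument (used instead of the system clock so the behaviour is easy to check) are all assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CommitReservations {
    // index version -> time in milliseconds until which that commit point must be kept
    private final Map<Long, Long> reservedUntil = new ConcurrentHashMap<>();

    /** Reserves a commit point, or extends its reservation, for reserveTimeMs counted from nowMs. */
    public void setReserveDuration(long indexVersion, long reserveTimeMs, long nowMs) {
        // never shorten an existing reservation
        reservedUntil.merge(indexVersion, nowMs + reserveTimeMs, Long::max);
    }

    /** True while the commit point is still protected from deletion. */
    public boolean isReserved(long indexVersion, long nowMs) {
        Long until = reservedUntil.get(indexVersion);
        return until != null && nowMs <= until;
    }

    public static void main(String[] args) {
        CommitReservations reservations = new CommitReservations();
        reservations.setReserveDuration(42L, 10_000L, 0L);            // reserved at t = 0 for 10 s
        System.out.println(reservations.isReserved(42L, 5_000L));     // inside the window
        System.out.println(reservations.isReserved(42L, 10_001L));    // window has passed
    }
}
```

In this model a transfer stays safe only as long as the master renews the reservation before it expires, which is why a link that needs more than 10 seconds per 5 MB calls for a longer reserveTime.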

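ReplicationHandler does not use rsync to copy files; it uses HTTP. When it reads a file to send to a slave, it writes the content in packets of 1 MB by default (HTTP chunked?), and by default it computes a checksum for each packet so that the slave can verify what it received. The setReserveDuration call mentioned above is triggered whenever packetsWritten % 5 == 0, that is, once every five packets.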
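To make the packet-plus-checksum idea concrete, here is a sketch that cuts a byte array into fixed-size packets and attaches a checksum to each one. It does not reproduce Solr's wire format. PacketSketch, Packet, split and verify are invented names, and Adler-32 is simply the checksum picked for the example, since the text above does not say which algorithm is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.Adler32;

public class PacketSketch {

    /** One packet of file content together with the checksum computed by the sender. */
    public static final class Packet {
        public final byte[] data;
        public final long checksum;

        Packet(byte[] data, long checksum) {
            this.data = data;
            this.checksum = checksum;
        }
    }

    static long checksum(byte[] data) {
        Adler32 adler = new Adler32();   // assumption: any checksum would do for the sketch
        adler.update(data, 0, data.length);
        return adler.getValue();
    }

    /** Sender side: cut the content into packets of at most packetSize bytes. */
    public static List<Packet> split(byte[] content, int packetSize) {
        if (packetSize <= 0) {
            throw new IllegalArgumentException("packetSize must be positive");
        }
        List<Packet> packets = new ArrayList<>();
        for (int off = 0; off < content.length; off += packetSize) {
            int end = Math.min(content.length, off + packetSize);
            byte[] part = Arrays.copyOfRange(content, off, end);
            packets.add(new Packet(part, checksum(part)));
        }
        return packets;
    }

    /** Receiver side: recompute the checksum and compare it with the one that was sent. */
    public static boolean verify(Packet packet) {
        return checksum(packet.data) == packet.checksum;
    }

    public static void main(String[] args) {
        byte[] content = new byte[3 * 1024 * 1024 + 10];       // a little over 3 MB
        List<Packet> packets = split(content, 1024 * 1024);    // 1 MB packets, as in the text
        System.out.println(packets.size() + " packets, first one valid: " + verify(packets.get(0)));
    }
}
```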

ReplicationHandler can also back up the index files. Because Lucene only ever adds new index files and never modifies existing ones, backing up a single IndexCommit point is straightforward.

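2. What the slave does

On startup the slave creates a SnapPuller object, which starts a thread that periodically (every pollInterval) replicates data from the master (the fetchLatestIndex method). For a single replication cycle, the slave and master interact as follows:
1. The slave first asks the master for the latest index version (the indexversion command). After checking that the returned latestVersion and latestGeneration are valid, it compares them with getVersion() and getGeneration() of its local IndexCommit. If they differ, it continues; otherwise it waits for the next scheduled poll.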

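      boolean isFullCopyNeeded = commit.getGeneration() >= latestGeneration;
      File tmpIndexDir = createTempindexDir(core);

      // if a local file has the same name as one on the master but a different size,
      // the slave's index is considered corrupted and must be downloaded in full
      if (isIndexStale())
        isFullCopyNeeded = true;
      successfulInstall = false;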
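The decision made at this point can be written as a small pure function. This is a sketch of the logic as described in these notes, not SnapPuller's code: ReplicationDecision, Action and decide are names made up here, and the indexStale flag stands in for the result of isIndexStale().

```java
public class ReplicationDecision {

    public enum Action { SKIP, INCREMENTAL, FULL_COPY }

    /**
     * Decides what one poll should do, given the slave's local commit
     * and the latest version/generation reported by the master.
     */
    public static Action decide(long localVersion, long localGeneration,
                                long latestVersion, long latestGeneration,
                                boolean indexStale) {
        // identical version and generation: nothing to do until the next poll
        if (localVersion == latestVersion && localGeneration == latestGeneration) {
            return Action.SKIP;
        }
        // a slave at or ahead of the master's generation, or with mismatched
        // files, cannot be patched and is replaced completely
        if (localGeneration >= latestGeneration || indexStale) {
            return Action.FULL_COPY;
        }
        return Action.INCREMENTAL;
    }

    public static void main(String[] args) {
        System.out.println(decide(100L, 3L, 100L, 3L, false));  // in sync
        System.out.println(decide(100L, 3L, 120L, 4L, false));  // master has a newer commit
        System.out.println(decide(130L, 5L, 120L, 4L, false));  // slave was modified locally
    }
}
```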

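2. The slave asks the master for the file list of the indexversion it just obtained (the filelist command; the list covers index files and, optionally, configuration files). If the list is empty, it returns and waits for the next poll. Otherwise it has to work out which files need downloading. Two checks are made: 1) if the local commit.getGeneration() >= latestGeneration, the local index is considered corrupted (for example, an index-modifying command was sent to the slave by mistake), and the master's files must be copied in full; 2) each file in the list is checked for existence locally and downloaded if it is missing.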

  private void downloadIndexFiles(boolean downloadCompleteIndex, File tmpIdxDir, long latestVersion) throws Exception {
    for (Map<String, Object> file : filesToDownload) {
      File localIndexFile = new File(solrCore.getIndexDir(), (String) file.get(NAME));
      // download only when the file is missing locally or a full download is required
      if (!localIndexFile.exists() || downloadCompleteIndex) {
        fileFetcher = new FileFetcher(tmpIdxDir, file, (String) file.get(NAME), false, latestVersion);
        currentFile = file;
        fileFetcher.fetchFile();
        filesDownloaded.add(new HashMap<String, Object>(file));
      } else {
        LOG.info("Skipping download for " + localIndexFile);
      }
    }
  }
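Stripped of the Solr types, the selection rule in downloadIndexFiles comes down to a filter over file names. The sketch below works on plain strings; DownloadPlan and filesToFetch are hypothetical names, and the file names in main are only examples of what a Lucene index directory might contain.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DownloadPlan {

    /**
     * Returns the files to download: everything on a full copy,
     * otherwise only the files the slave does not have yet.
     */
    public static List<String> filesToFetch(List<String> masterFiles,
                                            Set<String> localFiles,
                                            boolean fullCopy) {
        List<String> result = new ArrayList<>();
        for (String name : masterFiles) {
            if (fullCopy || !localFiles.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> master = Arrays.asList("_0.cfs", "_1.cfs", "segments_3");
        Set<String> local = new HashSet<>(Arrays.asList("_0.cfs", "segments_2"));
        System.out.println(filesToFetch(master, local, false));
        System.out.println(filesToFetch(master, local, true));
    }
}
```

Skipping files by name alone is safe here only because Lucene never rewrites an existing index file, as noted in the section on the master.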

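3. File content is downloaded with the filecontent command. Downloaded files go into a temporary directory, which sits in the same data directory as the existing index directory (named index by default) and is named index.<timestamp>. Once the download finishes, there are two cases: 1) For a full download, the files are not copied from the temporary directory into the existing one. Instead, index.properties in the data directory is updated to mark the newly created temporary directory as the index directory. The old index directory is not deleted and can be removed by hand. Of course, the abnormal situation in which the slave's generation is greater than the master's should not normally arise. 2) In the usual case, the files in the temporary index directory are copied into the old index directory, with segments_N copied last so that an exception partway through does not corrupt the data.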

 private boolean copyAFile(File tmpIdxDir, File indexDir, String fname, List<String> copiedfiles) {
    File indexFileInTmpDir = new File(tmpIdxDir, fname);
    File indexFileInIndex = new File(indexDir, fname);
    boolean success = indexFileInTmpDir.renameTo(indexFileInIndex);
    if(!success){
      try {
        LOG.error("Unable to move index file from: " + indexFileInTmpDir
              + " to: " + indexFileInIndex + " Trying to do a copy");
        FileUtils.copyFile(indexFileInTmpDir,indexFileInIndex);
        success = true;
      } catch (IOException e) {
        LOG.error("Unable to copy index file from: " + indexFileInTmpDir
              + " to: " + indexFileInIndex , e);
      }
    }
    if (!success) {
      for (String f : copiedfiles) {
        File indexFile = new File(indexDir, f);
        if (indexFile.exists())
          indexFile.delete();
      }
      delTree(tmpIdxDir);
      return false;
    }
    return true;
  }
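The "segments_N last" rule is easy to express as an ordering step that runs before the copy loop. The sketch below is an illustration, not the SnapPuller implementation; CopyOrder and segmentsLast are names I made up, and the only thing taken from the text is that files whose name starts with segments_ must be copied after everything else.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CopyOrder {

    /** Returns the names in their original order, except that segments_N files are moved to the end. */
    public static List<String> segmentsLast(List<String> names) {
        List<String> ordered = new ArrayList<>();
        List<String> segments = new ArrayList<>();
        for (String name : names) {
            if (name.startsWith("segments_")) {
                segments.add(name);
            } else {
                ordered.add(name);
            }
        }
        ordered.addAll(segments);
        return ordered;
    }

    public static void main(String[] args) {
        System.out.println(segmentsLast(Arrays.asList("segments_3", "_1.fdt", "_1.tis")));
    }
}
```

The reason for the ordering is that segments_N is the file that describes a commit: as long as it has not arrived, the partly copied files are not referenced by any commit in the index directory.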

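4. Once the new index files and any configuration files have been copied, the slave issues a commit on the SolrCore's UpdateHandler, which closes the IndexWriter and forces a new IndexSearcher to be opened to serve requests. In addition, if the SolrCore's UpdateHandler is a DirectUpdateHandler2 (it should always be), handler.forceOpenWriter() is called to delete old, unused index files, and replicationHandler.refreshCommitpoint() is called to update the slave's indexCommitPoint.

5. If replication fails, the slave writes the failure information to replication.properties in the data directory.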

Reposted from: http://blog.csdn.net/duck_genuine/article/details/6165314
