Let's use URLNormalizers to look at Nutch's plugin mechanism. The configure method in the Injector class contains this line:

    urlNormalizers = new URLNormalizers(job, URLNormalizers.SCOPE_INJECT);

It calls the following constructor:

    public URLNormalizers(Configuration conf, String scope) {
      this.conf = conf;
      this.extensionPoint = PluginRepository.get(conf)
          .getExtensionPoint(URLNormalizer.X_POINT_ID);
      ObjectCache objectCache = ObjectCache.get(conf);
      normalizers = (URLNormalizer[]) objectCache
          .getObject(URLNormalizer.X_POINT_ID + "_" + scope);
      if (normalizers == null) {
        normalizers = getURLNormalizers(scope);
      }
      if (normalizers == EMPTY_NORMALIZERS) {
        normalizers = (URLNormalizer[]) objectCache
            .getObject(URLNormalizer.X_POINT_ID + "_" + SCOPE_DEFAULT);
        if (normalizers == null) {
          normalizers = getURLNormalizers(SCOPE_DEFAULT);
        }
      }
      loopCount = conf.getInt("urlnormalizer.loop.count", 1);
    }

Here getExtensionPoint returns the corresponding extension point; URLNormalizer.X_POINT_ID is org.apache.nutch.net.URLNormalizer. For background on extension points, see the IBM technical article 《Nutch 插件系统浅析》 (An Analysis of the Nutch Plugin System). The constructor first looks in the cache for the normalizers of the requested scope, and calls getURLNormalizers(scope) if nothing is cached. If the result is EMPTY_NORMALIZERS, meaning the scope has no normalizers of its own, it falls back to the default scope: it looks in the cache under the SCOPE_DEFAULT key, and if that entry is null it calls getURLNormalizers(SCOPE_DEFAULT). loopCount specifies how many times the normalization pass is repeated; it is read from urlnormalizer.loop.count and defaults to 1.

The code of getURLNormalizers is as follows:

    URLNormalizer[] getURLNormalizers(String scope) {
      List<Extension> extensions = getExtensions(scope);
      List<URLNormalizer> normalizers =
          new Vector<URLNormalizer>(extensions.size());
      Iterator<Extension> it = extensions.iterator();
      while (it.hasNext()) {
        Extension ext = it.next();
        URLNormalizer normalizer = null;
        try {
          normalizer = (URLNormalizer) objectCache.getObject(ext.getId());
          if (normalizer == null) {
            // go ahead and instantiate it and then cache it
            normalizer = (URLNormalizer) ext.getExtensionInstance();
            objectCache.setObject(ext.getId(), normalizer);
          }
          normalizers.add(normalizer);
        } catch (PluginRuntimeException e) {
        }
      }
      return normalizers.toArray(new URLNormalizer[normalizers.size()]);
    }

It obtains the Extensions for the given scope (leave aside for now how they are obtained), instantiates each Extension that is not already cached under its id, and collects the instances in normalizers. The code of getExtensions begins as follows:

    private List<Extension> getExtensions(String scope) {
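
The lookup order described above (check the cache, instantiate on a miss, fall back to the default scope when a scope has nothing of its own) can be sketched without any Nutch dependency. This is only an illustrative sketch under stated assumptions: ScopedPluginLookup, Normalizer, register, forScope and normalizeAll are hypothetical names, not Nutch classes or methods, and the early exit in the repeat loop is an assumption about how a loop count might be used rather than something shown in the code quoted here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

public class ScopedPluginLookup {
    /** Hypothetical stand-in for a normalizer plugin; not the Nutch interface. */
    public interface Normalizer {
        String normalize(String url);
    }

    public static final String SCOPE_DEFAULT = "default";

    // Hypothetical registry: scope -> factories that create plugin instances.
    private final Map<String, List<Supplier<Normalizer>>> registry = new HashMap<>();
    // Cache of already instantiated plugins, keyed by scope.
    private final Map<String, List<Normalizer>> cache = new HashMap<>();
    private int instantiations = 0;

    public void register(String scope, Supplier<Normalizer> factory) {
        registry.computeIfAbsent(scope, k -> new ArrayList<>()).add(factory);
    }

    /** Returns the plugins for a scope, falling back to the default scope if it has none. */
    public List<Normalizer> forScope(String scope) {
        List<Normalizer> found = load(scope);
        if (found.isEmpty() && !SCOPE_DEFAULT.equals(scope)) {
            found = load(SCOPE_DEFAULT);
        }
        return found;
    }

    // Cache first; instantiate and cache only on a miss.
    private List<Normalizer> load(String scope) {
        List<Normalizer> cached = cache.get(scope);
        if (cached != null) {
            return cached;
        }
        List<Normalizer> created = new ArrayList<>();
        List<Supplier<Normalizer>> factories =
                registry.getOrDefault(scope, Collections.emptyList());
        for (Supplier<Normalizer> factory : factories) {
            created.add(factory.get());
            instantiations++;
        }
        cache.put(scope, created);
        return created;
    }

    public int instantiationCount() {
        return instantiations;
    }

    /** Applies every normalizer in order, repeating up to loopCount passes. */
    public static String normalizeAll(List<Normalizer> normalizers, String url, int loopCount) {
        String current = url;
        for (int pass = 0; pass < loopCount; pass++) {
            String before = current;
            for (Normalizer n : normalizers) {
                current = n.normalize(current);
            }
            if (current.equals(before)) {
                break; // assumption: stop early once a pass changes nothing
            }
        }
        return current;
    }

    public static void main(String[] args) {
        ScopedPluginLookup lookup = new ScopedPluginLookup();
        lookup.register(SCOPE_DEFAULT, () -> url -> url.trim());
        lookup.register(SCOPE_DEFAULT, () -> url -> url.toLowerCase());
        // "inject" has no normalizers of its own, so the default ones are used.
        List<Normalizer> normalizers = lookup.forScope("inject");
        System.out.println(normalizeAll(normalizers, " HTTP://Example.COM/ ", 1));
    }
}
```

Calling forScope("inject") a second time returns the same cached list and creates no new instances, which is the point of keeping instances in a cache, as the quoted code does with ObjectCache.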