NO META Robots for the top page of FreyaSX with "powered by"
findex/fmerge via CGI
fsx pub index ---> add access-control list
$FSXHOME/etc/fhttpd.conf and idx.lst
proxy <- query as URL(host name) -> search in url:HOST -> search in *:HOST
LexiconEntry: recording the number of documents which refer to the Word
  ... show documents linked from this document
  ... show documents linking to this document
Key shown in the context as a Link -> show wider context in the original
  ... and [2][3][4]... for other links not displayed in the first display
Frequently Asked Question -- Frequently Retrieved Words
adding HREF link for output in link:URL result output
add Referrer page automatically if the page really includes the link for me
CRC of URL (and meta info), CRC of content > .dsc
  ... url-crc is used to check update
FreyaSX entry via Proxy-DeleGate ... http://-.-/fsx/
LINK prev/next information in the original HTML like slides
  ... use it as a link as is in "one-document per display" mode
CGI when switched, insert STAB for docid=xxx where xxx does not match new key=..
PROXY -> FTSE
  ... count up access count of URL when URL is relayed on the proxy
  ... send http://ftse/countup?URL when URL is relayed on the proxy
  ... ICE or HTTP on UDP
  ... using HTTP over UDP, sending HTTP Header (URL, Last-Modified, ETag, ...)
  ... SYSLOG: send PROTOLOG on SYSLOG -> SYSLOG to FreyaSX counter
  ... any2fdif.conf
  ... cache expire and proxy
http://host:8880/fsx/
http://host:8880/fsx/admin?
etc/index.lst

0.99.18 140817 any2fdif.c: suppressed indexing a link in LINKS duplicatedly
0.99.18 140817 any2fdif.c: fixed full-url in LINKS
0.99.18 140728 any2fdif.c: introduced "-y include" option
0.99.18 140708 makevc.sh,make-vs8_win8.bat: build script for Windows
0.99.17 140708 SXc.{h,c},findex.cc: invocation from non-Command-Prompt terminal
0.99.17 140708 makevc.sh: building FreyaSX on Windows
0.99.17 140703 any2fdif.c: coped with hops limitation with multiple "-r URL"
0.99.17 140702 any2fdif.c: don't index duplicatedly but rescan links under different context
0.99.17 140702 any2fdif.c: escaped "VStr overflow" in HTML entity encoding
0.99.17 140621 DescRecord.cc: fixed uninitialized description string
0.99.17 140620 {rfc2html,any2fdif}.c: META-revised to represent the date of document
0.99.17 140619 fsearchcgi.cc: do sort=url-reverse as the last comparison
0.99.17 140617 *.{h,c}: coped with compilation with compilers nowadays
0.99.17 140616 fsearchcgi.cc: counter by URL if $FSXHOME/bank/_shared/ exists
0.99.17 140604 any2fdif.c: strip duplicated (old) "[xxx]" in Subject
0.99.17 140602 findex.cc: fixed default FDIF input when invoked from cron
0.99.17 140601 fsearchcgi.cc: introduced hidden_key=Key to be hidden key
0.99.17 140529 any2fdif.c: tentative "-all" option to index "xxx?query" too
0.99.17 140527 any2fdif.c: text to skip indexing the text
0.99.17 140527 any2fdif.c: ignore secondary not to index it
0.99.17 140527 fsearchcgi.cc: $cgibase to be used in HTML templates
0.99.17 140524 any2fdif.c: index any relative URL too
0.99.17 140524 any2fdif.c: index links up to 16K bytes (<- 2048)
0.99.17 140524 any2fdif.c: index links for FORM ACTION and IMG SRC too
0.99.17 140524 any2fdif.c: make index with URL of after 301/302 redirection for the -r URL
0.99.17 140524 any2fdif.c: suppressed "VStr overflow" with -Fany2fdif -r (for Links buffer)
0.99.17 140524 any2fdif.c: SEGV in isspace() for values negative or larger than 0xFF (Ubuntu)
0.99.16 060718 SX.cc: added -cNKL=n for the minimum key length to be indexed
0.99.16 060718 WordBreaker.h: added '_' as leading char. of word to be indexed
0.99.15 060116 Indexer: earlier switch from elfhash to CRC32 on heavy coll.
0.99.15 060115 fsearch.cgi: fixed SEGV on index=x&index=y (0.99.1)<setter>
0.99.14 060112 Query: searching a single Japanese character as prefixsearch
0.99.14 060111 WordBreaker: fixed to add the last character in a Japanese word
0.99.14 051013 fsx,ConText: added "dumpctx" command
0.99.13 051011 Retriever: introduced "^KanaWord" to exclude intermediate Kana
0.99.13 051010 fsearch.cgi: extended to highlight any keywords in digest/context
0.99.13 051010 findex: modified not to use dictionary by default (without -D)
0.99.13 051010 Query: automatic retrying prefix-search for Japanese word
0.99.12 051004 Indexer: dynamic hash-function upgrade detecting collision rate
0.99.12 051001 Indexer,*.{c,h}: coped with index larger than 2GB (int -> off_t)
0.99.12 050927 HashBag.h,HeapBag.h: coped with Gcc4
0.99.11 041015 findex: automatic adding index to "bank/index.lst"
0.99.11 041012 fsearch.cgi: fixed '\0' in logfile (since 0.93)
0.99.11 040922 fsx: added FORM/HTML interface to edit "index.lst"
0.99.11 040920 any2fdif: introduced "any2fdif.conf"
0.99.10 040918 findex: changed to run even without implicit "-d" MORPHDIC
0.99.10 040914 fsearch: made "-b" optional ("default" become unavailable)
0.99.10 040914 fsx: added "dumplex -i idx" command
0.99.10 040913 any2fdif: introduced suppressing too long line in FDIF output
0.99.10 040912 any2fdif: renamed obsolete (back. compati.) ".dsc" to ".desc"
0.99.10 040911 any2fdif: fixed disabled quoted-printable decoding (0.99.9)
0.99.10 040911 any2fdif: coped with MBOX format (Unix Mailbox format)
0.99.10 040911 any2fdif: fixed to see $FSXHOME (8-O)
0.99.10 040911 any2fdif: conversion filter for each file type ... -c.ext
0.99.10 040910 any2fdif: coped with VC++
0.99.9 040910 fsearchcgi: fixed Floating Exception when no ConText is found
0.99.9 040910 any2fdif: supported scanning directory by itself (-r dir option)
0.99.9 040909 any2fdif: supported getting remote documents specified in URLs
0.99.9 040909 any2fdif: fixed loop on HTML entity on Linux (on backward fseek())
0.99.9 040909 any2fdif: become a part of DeleGate
0.99.9 040909 any2fdif: added "-c cnv" option to preprocess input data
0.99.9 040908 ConText: fixed SEGV on free() (by delete []ptr) by findex -a
0.99.9 040908 fmerge,fsearch: made -b option (find in $FREYASX/bank) default
0.99.9 040908 any2fdif,findex: enabled find|any2fdif|findex idx ... without -b
0.99.9 040906 fsearchcgi: introduced docid=url.URL -> docid=xid.did conversion
0.99.9 040906 fsearchcgi: introduced redirecting find=URL => 302 to docid=x.y
0.99.9 040906 fsearchcgi: added sort by author
0.99.8 040904 *: ported onto Win32
0.99.8 040903 Retriever: fixed dust on prefix* search which caused SIGSEGV
0.99.8 040903 fsearch.cgi: fixed automatic detection of resp. charset
0.99.8 040902 Result.h: fixed initialization of found ConText list
0.99.7 040902 SX: introduced mapping from index=xxx.yyy to bank/xxx/yyy
0.99.7 040902 fsearch.cgi: URL&vote=xxx => count-up then => 302-moved to URL
0.99.7 040902 any2fdif: introduced -m "url1 url2" option to rewrite URL
0.99.7 040902 any2fdif: introduced -e sed-command option to edit URL
0.99.7 040901 fsearch.cgi: fixed to do %XX encoding for non-ASCII in anchors
0.99.7 040901 any2fdif: added Message-ID + References to <URL> + and <LINK>
0.99.7 040901 fsx: introduced fsx as an integrated interface for FreyaSX
0.99.6 040831 fsearch.cgi: escaped SIGSEGV on replace_string()
0.99.6 040831 ConText,NumFile,LexiconFile,XMap: be portable (net. byte order)
0.99.6 040831 Makefile,SX,XMap,kcnlib: modified for porting
0.99.6 040831 fsearch.cgi: changed to "show all" for empty query (lacking key=)
0.99.6 040830 fsearch.cgi: added sorting by clicking each key on each doc.
0.99.6 040829 fsearch.cgi: added a mode to display details of a document
0.99.5 040827 fsearch.cgi: introduced "jump=URL" to count selections "$scount"
0.99.5 040827 Indexer: introduced ".num" file as the collection of numbers
0.99.5 040826 findex,fmerge,Indexer: implemented purging duplicated documents
0.99.5 040825 DescRecord,DescFile,Indexer: added CRC32 into .dsc file
0.99.5 040824 Indexer,any2fdif: any2fdif -> pass FDIF only -> findex -> DSC
0.99.5 040824 findex,Indexer: indexing from FDIF and append to existing idx.
0.99.5 040823 fmerge: implemented merging sortfiles
0.99.4 040822 fsearch.cgi,SX: added "vote" interface and sort by vote count
0.99.4 040822 fmerge,Indexer,IndeXFile,LexiconEntry,ConText: added deletion
0.99.3 040821 fsearch.cgi: modified to put "Content-Type: text/html; charset="
0.99.3 040820 fsearch.cgi: replaced incremental "cout" by "stringstream sout"
0.99.2 040820 Context.c: fixed duplicated display of keywords in a same context
0.99.2 040819 fsearch.cgi: fixed to ignore empty index=
0.99.2 040818 LexiconFile,ConText: changed "fstream" to "Fstream" for speed-up
0.99.2 040817 Coder,IndexFile,SX: replaced variable bits Coder with byte coder
0.99.2 040816 fsearch.cgi: randomized stack
0.99.1 040815 SX,fsearch.cgi: introduced "index.lst"
0.99.1 040815 any2fdif: added <META> for Last-Modified and Content-Location
0.99.1 040815 Indexer: fixed the size of document in XMap (0.98.0)
0.99.1 040815 fsearch.cgi: supported sort by "size"
0.99.1 040814 Context: fixed to entitize "<" in short word in Context (0.99.0)
0.99.1 040814 fsearch.cgi: fixed multiple (more than twice) index=... (0.99.0)
0.99.1 040814 WordBreaker: changed to identify word like NeXT (-c mcc=2)
0.99.1 040814 fsearch.cgi: refined FORM interface and icons
0.99.1 040814 fsearch.cgi: changed to output statistics for empty query
0.99.1 040813 IndexFile,Indexer: fixed SEGV on large number of occurrences
0.99.0 040812 Retriever: fixed prefix search for JapaneseWord
0.99.0 040810 fsearch.cgi: fixed to refresh cache after update of executable
0.99.0 040809 Expression: fixed searching url:"a b" (since 0.98.0)
0.99.0 040809 DescFile: fixed SEGV on sorting URLs without ':' (since 0.98.2)
0.99.0 040807 Indexer,fsearch.cgi: introduced context display with .ctx
0.99.0 040807 fsearch.cgi: multiple index=idx1&index=idx2 -> index=idx1+idx2
0.99.0 040807 fsearch.cgi: introduced $zoom={big,small}
0.98.2 040806 WordBreaker: fixed not to skip word sequence number (0.98.0)
0.98.2 040806 fsearch.cgi: fixed not to apply Ja->En conv. for key=... param.
0.98.2 040806 Retriever: fixed Japanese phrase search (0.98.0)
0.98.2 040805 findex,fmerge,DescFile: added merging (gen.) of "idx+url.srt"
0.98.2 040805 fsearch.cgi: sticky button for each display of document
0.98.2 040805 fsearch.cgi,DescFile: introduced sort by URL (sort=url)
0.98.2 040804 Retriever: wild card like "link:www.*.org"
0.98.2 040804 findex,fmerge,Indexer: introduced -S option to show total stat.
0.98.1 040804 Indexer: a fix for fmerge (0.98.0)
0.98.0 040803 Indexer: simplified the merge algorithm
0.98.0 040802 LexiconEntry: introduced compression in LexiconEntry
0.98.0 040802 LexiconEntry: become extensible/customizable (-c mli=N)
0.98.0 040802 Indexer,HashBag: added HashBag::sortdump() for direct fast dump
0.98.0 040802 findex,Indexer: Unifier become optional (activated with -j)
0.98.0 040801 CachedIndexFile: has come to be unused
0.98.0 040801 IndexFile: become simple plain file
0.98.0 040801 fmerge: coped with 64bits pos_t
0.98.0 040731 any2fdif: changed the order of <ATTR>s (incompatible with olders)
0.98.0 040731 any2fdif: coped with text/plain with JIS-code conversion
0.98.0 040731 Retriever: fixed prefix* search (sort of 64bits pos_t)(0.97.1)
0.98.0 040730 SX: improved compression of pos_t by shuffling
0.98.0 040729 Word,WordBreaker: reduced the cost of alloc/copy/free of string
0.98.0 040729 Indexer,LexiconEntry: fixed merging on MacOSX (premature eof)
0.98.0 040729 fsearch.cgi: introduced version number display by "$version"
0.98.0 040728 SX,findex: introduced "-c MBW=256" to expand MAXBUFFRINGWORDS
0.98.0 040727 XMap,fmerge: changed to be a sequence of data size (4byte int)
0.98.0 040727 IndexFile: improved speed decreased at 0.97.1 (stack exhaustion)
0.98.0 040727 typedef,Coder: made more adaptable between long and long long
0.98.0 040727 Query,Indexer,SX: pointer as the sequence number of words
0.97.1 040725 IndexFile: fixed stack memory exhaustion on large unpacked data
0.97.1 040725 Coder: expanded to process 64bits
0.97.1 040725 typedef,SX: changed pos_t to 64bits {docid:31,type:6,offset:26}
0.97.1 040725 SX: restored phrase search (disabled at 0.97.0)
0.97.0 040724 any2fdif,SX: introduced <DIGEST> and digest:word
0.97.0 040724 Indexer,Query: made <TYPE> in FDIF and type:word query extensible
0.97.0 040723 fmerge: coped with searching in $FSXHOME/bank
0.97.0 040722 findex,Indexer,SX: don't overwrite existing files till the finish
0.97.0 040722 SX: moved icot.dic from dict/ to etc/
0.97.0 040722 any2fdif: added automatic detection of RFC822 file (without -fm)
0.97.0 040722 any2fdif: fixed searching non-HTTP-cache files (8.9.6-pre10)
0.97.0 040721 DescFile: introduced mmap for DescFile
0.97.0 040721 Indexer,WordBreaker,SX: offset as {docid:20,lines:7,type:5}
0.97.0 040721 Coder: fixed infinite loop on diffpack on decremental diff.
0.97.0 040721 any2fdif,Indexer,Expression,Query: introduced link:URL
0.96.1 040717 SX.cc: fix for searching "$FSXHOME/bank" (0.96)

-- TODO / MEMO --
*** searching ignoring Katakana/Hiragana case
*** storing with original case, and unify at search (also for upper/lower)
for variable length wild cards ... pos_t<<8|(0xFF&offset)
  ... use as offset for next word for each pos_t
display statistics as cgi.output (as search result)
fmerge .sum, .srt (just by catenation)
jumping to referer button for each digest ... to jump to "link:self" entry
"://" as a word, ://www.
any2fdif: fix skip comment ... <!-- without --> in index-ja.shtml of FreyaSX
History of past "search expression" of the client ... as selectable menu
Filter -- Range of date -- From ... To
special (pseudo) word ^ as the beginning of element, and $ as the end of element

-- DISPLAYING --
** show records around the record listed at the top of page
re-sorting around the selected URL (show result including the URL)
show links prior to digest for link:url search
generic reverse sort ... notation? sort=xxx-reverse
customizable weight for each element type (in text config. file)
template customized for each index ... bank/index/style.html
n=1&style=include ... style-include=<!--#include virtual=url>
Use ConText for phrase search?
-> Content-Type: text/shtml from CGI script to be processed by WWW server

-- SEARCHING --
phrase search by "pa * tte * rn" (regular expression)

-- SCORING --
** sort by reference count
** scoring by reference count of each document about the key
** sorting by vocabulary of each document
give higher score for words of lesser occurrence
score Word reference count against total words
word count per document rather than document size
sort by multiple-keys ... sort=author+date sort=author+url
sort by Title

-- INDEXING --
registering intermediate capital word like DeleGate or NeXT
restrict maximum number of occurrences (ex. 10th) of a word in a document

-- UPDATING --
any2fdif -b xxx ... append to *.fdif and *.dsc with *.hsh (HASH by CRC32)
findex -b N +o offset in the fdif and dsc ?

-- APPLICATION --
automatic adding URL of Referer into FreyaSX
apply to Source Program (.{cc,c,h,Makefile} of DeleGate, FreyaSX, etc.)
RFCsearch by Freya
Search Engine as a viewer for BBS
BBS based on NNTP ... cross posting, read-list, cancel, distributed-bbs
watching modified URL (antenna)
Search Engine + SMTP SPAM filter (relevant message)
vin + search-Engine (+ nntp proxy?)

-- COLLECTING --
collecting FDIF data from distributed proxies
any2fdif for mail ... Reference and Xref to <LINK>

sort file as an extended XMap
DescFile as an extended XMap
DescFile cache as an XMap ... in host byte order
if (feof()) then clear()
${value:%fmt} ... $atime and sort by $atime for access time of the file
  --- $lastacc ... as last-access time
sort by vote-count (vote via HTTP)
sort by the number of access (the number of atime changes)
sort by the number of modification (mtime update)
MOUNT "/-/fsx?* cgi://data/cgi-bin/fsx/fsearch.cgi?*"
use first string in text/plain as a title
PREFETCHing any2fdif input ...
making TAR format be the input to any2fdif
"built-in" icot.dic into findex
make Japanese/ICOTDIC optional
${from-1} out of range
from Error message to [Japanese] adding words into dictionary
* ignore ${include} in digest (user origin string)
* ignore non-200 resp
* ignore gzipped (with Content-Encoding)
* cache path "/=" -> URL "/"
* favicon.ico sent with text/plain
* duplicated entitize of "&" in META DESCRIPTION
* restore access time of cache
* truncate 2byte char. at boundary
uuencode dgmlJa/8161
make "-b" omittable
  any2fdif ix -f xxx
  findex ix
  fmerge ix ix1 ix2
  fsearch ix key
fsx merge
fsx index
fsx search
fsx stat
fsx exist? url ... whether the URL is in or not
treat large document (MAX<Words) as a series of sub-documents of (Word<=MAX)
any2fdif -e "s/.html$/.txt/" with sed
dgmlJa/7695 ... bug in context display ... to be entitized
meta-search via the interface of FreyaSX -> Google FAST
fmerge without uncompress/compress for Indice not to be rewritten
adding second key (by AND OR) into context slots if there are remaining slots
searching with MASK (fetching with positive mask)
  ... for creating ConText for result
  ... retrieve first N of all doc.s in sorted order
Hierarchy display
  ... "Range" display mode like vin (by date)
  ... "[001-100]" [1999] [2000] ...
  ... Index per line mode
  ... help message as a target of retrieval
  ... ?url=help-N.html
  ... everything as a target of retrieval
  ... virtual document, virtual index
Thread Display ... generic thread display (tree view) based on <LINK>
hierarchical (tree structured) indices
any2fdif.conf
find . | any2fdif | findex idx
  ... if input is PIPE then "-f -" by default, if output is PIPE then "-b -"
-r CACHEDIR -> set http://...
non-inherited parameters ... -key=... -sxop=
search referrers ... in style-one page
NNTP/HTTP gateway -> SHTML (<!--#include) -> FreyaSX interface
<!--#include virtual=http://freya/search?url=panel.html&key=url:url-of-this-page
  ... vote interface, counter display ...
link from NNTP/HTTP page to URL= docid= ...
libfsx.a ... any2fdif and DeleGate as a library
.vot inheriting / merging Votes from existing index
posting comment to each document
  ... allocate URL for each comment
  ... index each comment in real time
  ... work as history+bookmark+commentary
ConText: record multiple delimiters
findex -a idx fdif ... create idx and merge it into the last idx+N if the size+new is ...
  ... if the target is smaller than 5Mb
Lex,Idx temporary outputs as a single file
finding fsxhome: FSXHOME -> ~/freyasx -> ${EXECDIR}/../bank
name-sort.SRT
  ... sorted at index creation time
  ... mapping sort[x]->docid
  ... not be usable when multiple indices are merged
automatic prev/next by Refresh: header
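
memo: the "finding fsxhome" fallback order noted above (FSXHOME -> ~/freyasx -> ${EXECDIR}/../bank) might look roughly like the sketch below. This is only an illustration, not the FreyaSX implementation: the function name find_fsxhome is hypothetical, and the rule "take the first candidate that exists as a directory" is an assumption, since the memo gives only the order.

```python
import os
import sys
from pathlib import Path


def find_fsxhome(env, home_dir, exec_dir):
    """Return the first existing candidate directory, or None.

    Candidate order follows the memo: $FSXHOME, ~/freyasx, ${EXECDIR}/../bank.
    Testing each candidate with is_dir() is an assumption of this sketch.
    """
    candidates = []
    if env.get("FSXHOME"):
        candidates.append(Path(env["FSXHOME"]))
    candidates.append(Path(home_dir) / "freyasx")
    candidates.append(Path(exec_dir) / ".." / "bank")
    for candidate in candidates:
        if candidate.is_dir():
            # resolve() collapses the ".." in the third candidate
            return candidate.resolve()
    return None


if __name__ == "__main__":
    # EXECDIR is taken here as the directory holding the running script.
    exec_dir = Path(sys.argv[0]).resolve().parent
    print(find_fsxhome(os.environ, os.path.expanduser("~"), exec_dir))
```

In this sketch an FSXHOME value that points at a missing directory falls through to the next candidate; whether the real tools do the same or stop with an error is not stated in the memo.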