History log of /src/usr.sbin/makemandb/apropos-utils.h |
Revision | | Date | Author | Comments |
1.15 |
| 18-May-2019 |
abhinav | PR misc/54213: Fix performance of whatis(1) when no matches are found
In revision 1.6 of whatis.c the query was modified to also return matches for names found in the MLINKS of the man pages. However, it was slow, probably because it required a join and, more importantly, because a where condition on an FTS virtual table column is very slow. To avoid the join and the expensive where condition on the virtual table, add the name_desc column to the mandb_links table as well. This restores the performance of whatis(1) to its original level at the expense of slight data duplication.
Bump the schema version to force a database rebuild that accounts for the new column.
|
1.14 |
| 25-Nov-2017 |
abhinav | branches: 1.14.4; Encapsulate all the arguments required by the query callback function in a struct.
Adding or removing arguments from the callback functions requires changing the callback interface all over the place. Letting the callback expect a single struct argument cleans things up a bit.
ok christos
|
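The struct-argument pattern described for revision 1.14 can be sketched as follows. This is a minimal illustration, not the code from apropos-utils.h: the names `result_args`, `result_callback`, `run_query` and `count_and_print`, and the choice of fields, are all hypothetical, and the two hard-coded rows stand in for results that would really come from the database.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bundle of per-result data; field names are illustrative. */
struct result_args {
	const char *section;
	const char *name;
	const char *name_desc;
	void *user_data;	/* caller-owned state, passed through untouched */
};

/* Adding a field to result_args later does not change this signature. */
typedef int (*result_callback)(const struct result_args *);

/* Example callback: print one result and count it. */
static int
count_and_print(const struct result_args *args)
{
	int *count = args->user_data;

	(*count)++;
	printf("%s(%s) - %s\n", args->name, args->section, args->name_desc);
	return 0;
}

/* Stand-in for the query loop: invokes cb once per (hard-coded) row. */
static int
run_query(result_callback cb, void *user_data)
{
	static const char *rows[][3] = {
		{ "1", "ls", "list directory contents" },
		{ "1", "cp", "copy files" },
	};

	assert(cb != NULL);
	for (size_t i = 0; i < sizeof(rows) / sizeof(rows[0]); i++) {
		struct result_args args = {
			.section = rows[i][0],
			.name = rows[i][1],
			.name_desc = rows[i][2],
			.user_data = user_data,
		};
		int rc = cb(&args);
		if (rc != 0)
			return rc;	/* a non-zero return stops the iteration */
	}
	return 0;
}
```

A caller would pass `count_and_print` and a pointer to an `int` counter to `run_query`; a new per-result field would then only touch the struct definition and the callbacks that care about it.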
1.13 |
| 18-Jun-2017 |
abhinav | Add a custom tokenizer which does not stem certain keywords.
Which keywords should not be stemmed is specified in the nostem.txt file. (Right now I have taken all the man page names, split them if they had underscores, removed common English words and converted everything to lowercase.)
The tokenizer itself is based on the Porter stemming tokenizer shipped with Sqlite. The code in custom_apropos_tokenizer.c is a copy of that code with some modifications to prevent stemming of the keywords specified in nostem.txt.
Additionally, it now also uses underscore `_' as a token delimiter. Therefore, it is now possible to query for `lwp' and all `_lwp_*' man page names will be matched. Or the query can be `unconst' and `__UNCONST' will be matched. This was not possible earlier, because underscore was not a delimiter and therefore the index would have __UNCONST as a key rather than UNCONST.
The tokenizer needs fts3_tokenizer.h file, which is not shipped with the amalgamation build of Sqlite, therefore it needs to be added here (unless we decide there is a better place for it).
To enforce using the new tokenizer, a schema version bump is needed.
Since the tokenization is done both at indexing time (via makemandb) and at query time (via apropos or whatis), the schema version will need to be bumped every time nostem.txt is modified. Otherwise the index will consist of old tokens and the desired changes will not be seen with apropos.
This should also fix the issue reported in PR bin/46255. A similar suggestion was also made on tech-userlevel@ recently: <http://mail-index.netbsd.org/tech-userlevel/2017/06/08/msg010620.html>
Thanks to christos@ for multiple rounds of reviews of the tokenizer code.
|
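The effect of treating underscore as a delimiter, as described for revision 1.13, can be sketched as follows. This is only an illustration of the splitting step under simplifying assumptions: it is not the code from custom_apropos_tokenizer.c, it does no stemming and no nostem.txt lookup, and the names `split_tokens`, `is_delimiter` and `MAX_TOKEN_LEN` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKEN_LEN 32	/* illustrative limit; longer tokens are truncated */

/* Anything that is not alphanumeric, including '_', separates tokens. */
static int
is_delimiter(unsigned char c)
{
	return !isalnum(c);
}

/*
 * Split input into lowercase tokens, storing at most max_tokens of them.
 * Returns the number of tokens stored.
 */
static size_t
split_tokens(const char *input, char tokens[][MAX_TOKEN_LEN], size_t max_tokens)
{
	const unsigned char *p = (const unsigned char *)input;
	size_t ntokens = 0;

	assert(input != NULL);
	while (*p != '\0' && ntokens < max_tokens) {
		/* Skip a run of delimiters such as the leading "__". */
		while (*p != '\0' && is_delimiter(*p))
			p++;
		if (*p == '\0')
			break;

		/* Copy one token, folding it to lowercase. */
		size_t len = 0;
		while (*p != '\0' && !is_delimiter(*p)) {
			if (len < MAX_TOKEN_LEN - 1)
				tokens[ntokens][len++] = (char)tolower(*p);
			p++;
		}
		tokens[ntokens][len] = '\0';
		ntokens++;
	}
	return ntokens;
}
```

With this splitting, `__UNCONST` yields the single key `unconst` and `_lwp_create` yields `lwp` and `create`, which is why a query for `lwp` can match the `_lwp_*` names.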
1.12 |
| 01-May-2017 |
abhinav | Simplify handling of the section arguments in apropos(1).
Earlier, a whitespace-separated string was generated containing all the section numbers passed as command line arguments. Later on, that string had to be tokenized and processed. Instead, use a NULL-terminated array of strings.
Thanks to christos@ for reviewing and suggesting further improvements.
|
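The NULL-terminated array of section strings introduced in revision 1.12 can be consumed without any tokenizing step. The sketch below assumes a hypothetical helper named `section_selected`; treating a NULL array as "no filter" is also an assumption made for the example, not something the log states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return nonzero if `section` appears in the NULL-terminated array
 * `sections`. A NULL array is treated as "no filter" (an assumption).
 */
static int
section_selected(const char *const *sections, const char *section)
{
	assert(section != NULL);
	if (sections == NULL)
		return 1;
	for (size_t i = 0; sections[i] != NULL; i++) {
		if (strcmp(sections[i], section) == 0)
			return 1;
	}
	return 0;
}
```

Because each element is already a separate string, non-numeric section names such as `3lua` need no special handling.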
1.11 |
| 13-Apr-2016 |
christos | branches: 1.11.6; PR/51062: Abhinav Upadhyay: Allow non numeric sections to be indexed and searched by apropos(1). Fold long lines.
|
1.10 |
| 13-Apr-2016 |
christos | PR/51038: Abhinav Upadhyay: check for access permissions to the sqlite database
|
1.9 |
| 02-Apr-2013 |
christos | instead of having a format and no format flag, and exposing various formatters, provide a format enum and expose html formatting too.
|
1.8 |
| 29-Mar-2013 |
christos | - If the stdout is not a tty, prevent formatting unless forced with -i
- Don't ever page unless asked for with -p
- Introduce "legacy mode" (-l)
  1. searches only name and name_desc, prints name(section) - name_description
  2. turns off escape formatting (can be forced on with -i)
  3. turns off context printing (can be forced on with -c)
- Parse the environment $APROPOS variable as an argument vector.
With these changes one can simply 'export APROPOS=-l' and get the old apropos behavior.
|
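Parsing an environment variable as an argument vector, as revision 1.8 does for $APROPOS, amounts to splitting a string on whitespace into a NULL-terminated array. The following is a minimal sketch under stated assumptions: `split_words` is a hypothetical helper, quoting and escapes are not handled, and it is not the parsing code used by apropos(1).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Split buf in place on whitespace. Pointers to the words are stored in
 * argv, followed by a terminating NULL; at most max_args - 1 words are
 * stored. Returns the number of words.
 */
static size_t
split_words(char *buf, char **argv, size_t max_args)
{
	size_t argc = 0;
	char *p = buf;

	assert(buf != NULL && argv != NULL && max_args > 0);
	while (argc + 1 < max_args) {
		/* Skip leading whitespace. */
		while (isspace((unsigned char)*p))
			p++;
		if (*p == '\0')
			break;

		/* Record the word and terminate it in place. */
		argv[argc++] = p;
		while (*p != '\0' && !isspace((unsigned char)*p))
			p++;
		if (*p != '\0')
			*p++ = '\0';
	}
	argv[argc] = NULL;
	return argc;
}
```

A caller would copy the result of `getenv("APROPOS")` into a writable buffer first, since the helper modifies its input, and could then handle the resulting vector the same way as the command line arguments.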
1.7 |
| 10-Feb-2013 |
christos | add -r flag to elide tty formatting
|
1.6 |
| 10-Feb-2013 |
christos | remove trailing whitespace
|
1.5 |
| 14-Jan-2013 |
christos | - move the terminal handling into apropos-utils.c since html and pager are also handled there.
- underline the name, section, and description so that it is prettier.
- change the terminal highlighting to bold to match less
|
1.4 |
| 06-Oct-2012 |
wiz | Make mandb path configurable. makemandb (and related tools) use the path from the _mandb variable from man.conf now.
Set _mandb in man.conf to same value as was used before.
From Abhinav Upadhyay <er.abhinav.upadhyay@gmail.com>.
|
1.3 |
| 07-May-2012 |
wiz | branches: 1.3.2; PR 46419 by Abhinav Upadhyay using his updated patch: Clean up after removing man page aliases.
|
1.2 |
| 07-Feb-2012 |
joerg | branches: 1.2.2; 1.2.4; Fix C&P error with $NetBSD$
|
1.1 |
| 07-Feb-2012 |
joerg | Import the new apropos/whatis.
This code has been developed by Abhinav Upadhyay as part of Google's Summer of Code 2011. It uses libmandoc to parse man pages and builds a Full Text Index in a SQLite database. The combination of indexing the full manual page, filtering out stop words and ranking individual matches based on the section gives a much improved user experience.
The old makewhatis and friends are kept under MKMAKEMANDB=no for now.
|
1.2.4.6 |
| 22-May-2014 |
yamt | sync with head.
for a reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
1.2.4.5 |
| 23-Jan-2013 |
yamt | sync with head
|
1.2.4.4 |
| 30-Oct-2012 |
yamt | sync with head
|
1.2.4.3 |
| 23-May-2012 |
yamt | sync with head.
|
1.2.4.2 |
| 17-Apr-2012 |
yamt | sync with head
|
1.2.4.1 |
| 07-Feb-2012 |
yamt | file apropos-utils.h was added on branch yamt-pagecache on 2012-04-17 00:09:49 +0000
|
1.2.2.1 |
| 09-May-2012 |
riz | Pull up following revision(s) (requested by wiz in ticket #229):
	usr.sbin/makemandb/makemandb.c: revision 1.9
	usr.sbin/makemandb/DBSCHEMA: revision 1.2
	usr.sbin/makemandb/apropos-utils.c: revision 1.5
	usr.sbin/makemandb/apropos-utils.h: revision 1.3
PR 46419 by Abhinav Upadhyay using his updated patch: Clean up after removing man page aliases.
|
1.3.2.3 |
| 23-Jun-2013 |
tls | resync from head
|
1.3.2.2 |
| 25-Feb-2013 |
tls | resync with head
|
1.3.2.1 |
| 20-Nov-2012 |
tls | Resync to 2012-11-19 00:00:00 UTC
|
1.11.6.1 |
| 02-May-2017 |
pgoyette | Sync with HEAD - tag prg-localcount2-base1
|
1.14.4.1 |
| 10-Jun-2019 |
christos | Sync with HEAD
|