History log of /src/usr.sbin/makemandb/apropos-utils.c |
Revision | | Date | Author | Comments |
1.51 |
| 03-Aug-2023 |
rin | makemandb: trailing whitespace
|
1.50 |
| 11-Sep-2022 |
gutteridge | makemandb/*: fix spelling of database and consistency of SQLite
|
1.49 |
| 19-May-2022 |
gutteridge | apropos(1): improve error handling in edge cases
Patch from RVP on NetBSD-Users, with an additional comment tweak by me. Summary from RVP:
1. Ignore SIGPIPE so that we're not killed in the middle of some DB operation by a botched $PAGER:
$ env PAGER=/non-existent apropos -p ...
2. Return proper exit status in case of write errors:
$ apropos ... >/dev/full || echo fail
|
1.48 |
| 27-Nov-2021 |
rillig | usr.sbin: remove unnecessary CONSTCOND, lint no longer needs it
Since 2021-01-31, lint no longer requires a CONSTCOND comment in a do-while-0 statement since this is a common code pattern, especially in statement-like macros.
sed -i -E 's,} while \(/\* ?CONSTCOND ?\*/ ?0\),} while (0),' */*.[ch]
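For readers unfamiliar with the pattern the comment refers to, here is a small sketch of a statement-like macro wrapped in do-while-0. The macro `SWAP_INT` and the function `order_pair` are made up for illustration and do not appear in the source file.

```c
/*
 * Hypothetical statement-like macro. The do { ... } while (0) wrapper
 * makes the expansion a single statement, so it composes correctly with
 * an unbraced if/else and still requires a trailing semicolon.
 */
#define SWAP_INT(a, b) do {    \
	int tmp_ = (a);        \
	(a) = (b);             \
	(b) = tmp_;            \
} while (0)

/* Put the smaller value in *lo and the larger in *hi. */
static void
order_pair(int *lo, int *hi)
{
	if (*lo > *hi)
		SWAP_INT(*lo, *hi);
	else
		(void)0;	/* the else still pairs with the if above */
}
```

Without the wrapper, the multi-statement expansion followed by `;` would detach the `else` from its `if` and fail to compile.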
|
1.47 |
| 18-Aug-2019 |
abhinav | PR bin/54343: We want the callback_args.machine to be NULL if it is not present in the DB.
The previous commit fixed the problem of allowing apropos to not crash and produce output even if the database is missing values for certain mandatory fields, such as name, section etc. Normally we don't expect those values to be missing in the database but in case of parsing errors it can happen.
However, the machine architecture is an optional field since not all man pages are hardware specific so that should be allowed to be set to NULL if not present in the database.
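The distinction between mandatory and optional fields might look like the sketch below. The struct `page_fields` and the helpers `mandatory` and `normalize_fields` are hypothetical names; only the `"*?*"` placeholder (from revision 1.46) and the NULL `machine` value come from the log.

```c
#include <stddef.h>

/* Placeholder for missing mandatory fields, as described in revision 1.46. */
#define MISSING_FIELD "*?*"

/* Hypothetical record; the real struct has a different layout. */
struct page_fields {
	const char *name;
	const char *section;
	const char *machine;	/* optional: NULL when absent from the DB */
};

/* Substitute the placeholder for a missing mandatory value. */
static const char *
mandatory(const char *value)
{
	return value != NULL ? value : MISSING_FIELD;
}

static struct page_fields
normalize_fields(const char *name, const char *section, const char *machine)
{
	struct page_fields f;

	f.name = mandatory(name);
	f.section = mandatory(section);
	f.machine = machine;	/* deliberately left NULL if not present */
	return f;
}
```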
|
1.46 |
| 15-Aug-2019 |
christos | PR/54343: Prevent NULL pointers in callback strings; use "*?*" for now to identify them.
|
1.45 |
| 07-Jun-2019 |
leot | branches: 1.45.2; Properly free section_clause.
|
1.44 |
| 18-May-2019 |
abhinav | PR misc/54213: Fix performance of whatis(1) when no matches are found
In revision 1.6 of whatis.c the query was modified to return matches for names found in MLINKS of the man pages as well. However, it was slow, probably because it required a join. More importantly, a WHERE condition on an FTS virtual table column is very slow. To avoid the join and the expensive WHERE condition on the virtual table, add the name_desc column to the mandb_links table as well. This restores the performance of whatis(1) to the original level at the expense of slight data duplication.
Bump the schema version to force a database rebuild that accounts for the new column.
|
1.43 |
| 19-Apr-2019 |
abhinav | Memory allocated by sqlite3_mprintf should be free'd by sqlite3_free
This was causing memory corruption, thus making apropos(1) fail in some cases. Specifically, the following options were broken and should be fixed with this commit:
- The -n option was causing a core dump.
- apropos was giving a warning when using -l together with any of the section numbers as options, as reported by paulg on current-users.
|
1.42 |
| 14-Apr-2019 |
abhinav | Set the snippet_length field of the callback_args
Because of this field not being set, apropos was failing to show snippet when piped to a pager or when used with -p argument.
|
1.41 |
| 07-Mar-2019 |
christos | fix memory allocation problems detected by jemalloc...
|
1.40 |
| 25-Nov-2017 |
abhinav | branches: 1.40.4; Encapsulate all the arguments required by the query callback function in a struct.
If we want to add or remove arguments from the callback functions, it requires changing the callback interface all over the place. By letting the callback simply expect a single struct argument, it would clean things up a bit.
ok christos
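A rough sketch of the idea follows. The `machine` and `snippet_length` members are mentioned elsewhere in this log; every other member, the `query_callback` typedef, and both functions are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for the struct passed to the query callback. */
struct callback_args {
	const char *section;
	const char *name;
	const char *name_desc;
	const char *machine;	/* may be NULL */
	size_t snippet_length;
	void *other_data;	/* caller-supplied context */
};

/* The callback takes one struct pointer instead of a long argument list. */
typedef int (*query_callback)(const struct callback_args *);

/* Example callback: counts results through the caller's context pointer. */
static int
count_callback(const struct callback_args *args)
{
	int *count = args->other_data;

	(*count)++;
	return 0;
}

static int
run_callback(query_callback cb, const struct callback_args *args)
{
	return cb(args);
}
```

With this shape, adding a field to `struct callback_args` leaves the callback signature, and therefore every existing callback, unchanged.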
|
1.39 |
| 01-Aug-2017 |
abhinav | Don't use the custom tokenizer when compiled with debugging on
Using the custom tokenizer means one cannot interactively query the database through the SQLite shell, thus thwarting the purpose of the debug build option.
Thanks to leot@ for reporting it.
(While there change the debug macro from DEBUG to APROPOS_DEBUG)
|
1.38 |
| 18-Jun-2017 |
abhinav | Add a custom tokenizer which does not stem certain keywords.
Which keywords should not be stemmed is specified in the nostem.txt file. (Right now I have taken all the man page names, split them if they had underscores, removed common English words and converted everything to lowercase.)
The tokenizer itself is based on the Porter stemming tokenizer shipped with SQLite. The code in custom_apropos_tokenizer.c is a copy of that code with some modifications to prevent stemming of the keywords specified in nostem.txt.
Additionally, it now uses underscore `_' also as a token delimiter. Therefore, now it's possible to do query for `lwp' and all `_lwp_*' man page names will be matched. Or the query can be `unconst' and `__UNCONST' will be matched. This was not possible earlier, because underscore was not a delimiter and therefore the index would have __UNCONST as a key rather than UNCONST.
The tokenizer needs the fts3_tokenizer.h file, which is not shipped with the amalgamation build of SQLite; therefore it needs to be added here (unless we decide there is a better place for it).
To enforce using the new tokenizer, a schema version bump is needed.
Since tokenization is done both at indexing time (via makemandb) and at query time (via apropos or whatis), the schema version must be bumped every time nostem.txt is modified. Otherwise the index will consist of old tokens and the desired changes will not be seen with apropos.
This should also fix the issue reported in PR bin/46255. Similar suggestion was also made on tech-userlevel@ recently: <http://mail-index.netbsd.org/tech-userlevel/2017/06/08/msg010620.html>
Thanks to christos@ for multiple rounds of reviews of the tokenizer code.
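The effect of treating underscore as a delimiter can be shown with a toy splitter. This is not the code in custom_apropos_tokenizer.c: it omits stemming and the nostem.txt lookup entirely, the name `split_tokens` and the fixed limits are invented, and lowercasing is assumed here only because the message says a query for `unconst` matches `__UNCONST`.

```c
#include <ctype.h>
#include <stddef.h>

#define MAX_TOKENS	8
#define MAX_TOKEN_LEN	32

/*
 * Split input into lowercased tokens, treating every non-alphanumeric
 * character (including '_') as a delimiter. Returns the token count.
 * Tokens beyond MAX_TOKENS are dropped; over-long tokens are truncated.
 */
static size_t
split_tokens(const char *input, char out[MAX_TOKENS][MAX_TOKEN_LEN])
{
	size_t ntok = 0;
	size_t len = 0;

	for (const char *p = input; ; p++) {
		unsigned char c = (unsigned char)*p;

		if (isalnum(c)) {
			if (ntok < MAX_TOKENS && len < MAX_TOKEN_LEN - 1)
				out[ntok][len++] = (char)tolower(c);
			continue;
		}
		/* Delimiter or end of string: close the current token. */
		if (len > 0) {
			out[ntok][len] = '\0';
			ntok++;
			len = 0;
		}
		if (c == '\0')
			break;
	}
	return ntok;
}
```

With this splitter `__UNCONST` yields the single token `unconst`, and `_lwp_create` yields `lwp` and `create`, which is why a query for `lwp` can match the `_lwp_*` page names.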
|
1.37 |
| 01-May-2017 |
abhinav | Simplify handling of the section arguments in apropos(1).
Earlier, a white space separated string was generated containing all the section numbers passed through command line arguments. Later on that would have to be tokenized and processed. Instead of that, use a NULL terminated array of strings.
Thanks to christos@ for reviewing and suggesting further improvements.
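As an illustration of why a NULL-terminated array is simpler to consume than a space-separated string, the sketch below walks such an array directly. The function `join_sections` is hypothetical and merely joins the entries with commas; it does not reproduce how the real code builds its SQL clause.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Join a NULL-terminated array of section names with commas.
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
static int
join_sections(const char *const *sections, char *buf, size_t bufsize)
{
	size_t used = 0;

	if (bufsize == 0)
		return -1;
	buf[0] = '\0';
	for (size_t i = 0; sections[i] != NULL; i++) {
		int n = snprintf(buf + used, bufsize - used, "%s%s",
		    i == 0 ? "" : ",", sections[i]);

		if (n < 0 || (size_t)n >= bufsize - used)
			return -1;	/* output would be truncated */
		used += (size_t)n;
	}
	return 0;
}
```

No tokenizing step is needed: the loop ends at the NULL sentinel, and non-numeric names such as `3lua` pass through unchanged.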
|
1.36 |
| 30-Apr-2017 |
abhinav | Simplify
|
1.35 |
| 30-Apr-2017 |
abhinav | Instead of dereferencing the pointer passed in as function argument, use a temporary local buffer. Saves the cost of pointer dereferencing at so many places.
|
1.34 |
| 30-Apr-2017 |
abhinav | Update the comment to be in sync with the code.
|
1.33 |
| 30-Apr-2017 |
abhinav | Use sqlite3_mprintf() to generate SQL query instead of asprintf.
|
1.32 |
| 27-Apr-2017 |
abhinav | Disable the database compression if DEBUG is defined.
When debugging makemandb(8), it helps to be able to view the text being stored in the database.
|
1.31 |
| 23-Apr-2017 |
abhinav | branches: 1.31.2; Better handle MLINKS in apropos(1).
apropos(1) only indexes the first .Nm entry from the NAME section in the full text index. Rest of the .Nm entries are stored in a separate table: mandb_links.
Till now apropos(1) did not use the mandb_links table. So whenever a query was being made for one of the man page links, such as realloc(3), it was showing malloc(3) in the results but not as the first result. And, also the result would show up as malloc(3), rather than realloc(3) (which can be confusing).
With this change, for single keyword queries, apropos(1) would now utilise the mandb_links table as well. If the query is for one of the links of a man page, it would show as the first result. Also, the result would show up as the name of the link rather than the original man page name. For example, if the query was for realloc, the output would be realloc(3), rather than malloc(3).
Following are some example queries showing difference in the output before this change and after this change:
#Before changes
$ apropos -n 5 -M realloc
reallocarr (3) reallocate array
reallocarray (3) reallocate memory for an array of elements checking for overflow
fgetwln (3) get a line of wide characters from a stream
fgetln (3) get a line from a stream
posix_memalign (3) aligned memory allocation
#After changes
$ ./apropos -n 5 -M realloc
realloc (3) general memory allocation operations
realloc (3) general purpose memory allocation functions
realloc (9) general-purpose kernel memory allocator
reallocarr (3) reallocate array
reallocarray (3) reallocate memory for an array of elements checking for overflow
#Before changes
$ apropos -n 5 -M TAILQ_REMOVE
SLIST_HEAD (3) implementations of singly-linked lists, lists, simple queues, tail queues, and singly-linked tail queues
#After changes
$ ./apropos -n 5 -M TAILQ_REMOVE
TAILQ_REMOVE (3) implementations of singly-linked lists, lists, simple queues, tail queues, and singly-linked tail queues
#Before changes
$ apropos -n 5 -M falloc
filedesc (9) file descriptor tables and operations
file (9) operations on file entries
#After changes
$ ./apropos -n 5 -M falloc
falloc (9) file descriptor tables and operations
file (9) operations on file entries
ok christos@
|
1.30 |
| 10-Jan-2017 |
kamil | Include <unistd.h> for R_OK W_OK STDOUT_FILENO access(2)
These symbols are undefined after switch to new zlib.
|
1.29 |
| 03-Oct-2016 |
abhinav | Mark the section and md5_hash columns as unindexed in the FTS table, as they are not used for search
|
1.28 |
| 06-Jul-2016 |
abhinav | branches: 1.28.2; Fix an off by one issue when concatenating strings.
|
1.27 |
| 06-Jul-2016 |
abhinav | Fix possible buffer overflow when concatenating strings. Patch from christos@
|
1.26 |
| 01-Jun-2016 |
abhinav | Refactor the function for executing the search SQL query into two parts.
One part is responsible for generating the SQL query. The other part is responsible for executing the generated query.
While there, also remove a comment which is not valid anymore. And, don't call the snippet function when doing legacy mode search as we are not using the full text feature there.
|
1.25 |
| 24-Apr-2016 |
christos | CID 1358675: Wrong variable test
|
1.24 |
| 13-Apr-2016 |
christos | PR/51062: Abhinav Upadhyay: Allow non numeric sections to be indexed and searched by apropos(1). Fold long lines.
|
1.23 |
| 13-Apr-2016 |
christos | PR/51038: Abhinav Upadhyay: check for access permissions to the sqlite database
|
1.22 |
| 31-Mar-2016 |
christos | PR/51025: Abhinav Upadhyay: Remove unused includes from apropos-utils.c
|
1.21 |
| 24-Mar-2016 |
christos | PR/51004: Abhinav Upadhyay: apropos html mode doesn't handle special characters in the short description
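The log does not say which characters the fix covers, so the following is only a generic sketch of HTML escaping in C. The function name `html_escape` and the set of escaped characters (`&`, `<`, `>`, `"`) are assumptions, not a description of the committed code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, replacing characters that are special in HTML.
 * Returns 0 on success, -1 if dst is too small for the escaped text.
 */
static int
html_escape(const char *src, char *dst, size_t dstsize)
{
	size_t used = 0;

	for (; *src != '\0'; src++) {
		char single[2] = { *src, '\0' };
		const char *rep;
		size_t len;

		switch (*src) {
		case '&': rep = "&amp;"; break;
		case '<': rep = "&lt;"; break;
		case '>': rep = "&gt;"; break;
		case '"': rep = "&quot;"; break;
		default:  rep = single; break;
		}
		len = strlen(rep);
		if (used + len >= dstsize)
			return -1;	/* keep room for the terminating NUL */
		memcpy(dst + used, rep, len);
		used += len;
	}
	if (used >= dstsize)
		return -1;
	dst[used] = '\0';
	return 0;
}
```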
|
1.20 |
| 20-Mar-2016 |
christos | PR/50460: Abhinav Upadhyay: Fix legacy apropos query to match both the name and the one line description and delete extra args.
|
1.19 |
| 03-Dec-2015 |
christos | CID 1341551: Don't bother formatting if ti == NULL
|
1.18 |
| 23-Nov-2015 |
christos | PR/50344: Stephen Fisher: apropos shows formatting on console with vt100 term type. Can't print terminfo sequences directly; need to process them with ti_puts() to handle padding. This removes the padding delays, and strictly could break on slow terminal hardware, but the way the code is structured makes it impossible to fix properly (since the formatting strings are passed in the query). XXX: pullup-7
|
1.17 |
| 18-Oct-2014 |
snj | src is too big these days to tolerate superfluous apostrophes. It's "its", people!
|
1.16 |
| 01-Aug-2014 |
wiz | branches: 1.16.2; Fix an off by one bug in apropos. The bug is in the html output where some garbage characters are seen in the context match output.
From Abhinav Upadhyay in PR 49058.
|
1.15 |
| 02-Apr-2013 |
christos | branches: 1.15.4; instead of having a format and no format flag, and exposing various formatters, provide a format enum and expose html formatting too.
|
1.14 |
| 29-Mar-2013 |
christos | fix legacy mode in pager filter. (don't ul format if we are not formatting).
|
1.13 |
| 29-Mar-2013 |
christos | - Fix legacy mode to use like instead of match. This loses ranking.
- default to unlimited lines
- fix formatting of legacy mode
|
1.12 |
| 29-Mar-2013 |
christos | - If the stdout is not a tty, prevent formatting unless forced with -i
- Don't ever page unless asked for with -p
- Introduce "legacy mode" (-l)
  1. searches only name and name_desc, prints name(section) - name_description
  2. turns off escape formatting (can be forced on with -i)
  3. turns off context printing (can be forced on with -c)
- Parse the environment $APROPOS variable as an argument vector.
With these changes one can simply 'export APROPOS=-l' and get the old apropos behavior.
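Treating `$APROPOS` as an argument vector could be sketched as below. The function `split_args` and the `MAX_ARGS` limit are hypothetical, and this version splits on spaces and tabs only, with no quoting support; the real parsing may differ.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 16

/*
 * Split s in place on spaces and tabs, storing pointers to the words in
 * argv and terminating the vector with NULL. Returns the word count.
 * Words beyond MAX_ARGS are ignored.
 */
static int
split_args(char *s, char *argv[MAX_ARGS + 1])
{
	int argc = 0;

	while (argc < MAX_ARGS) {
		s += strspn(s, " \t");		/* skip leading blanks */
		if (*s == '\0')
			break;
		argv[argc++] = s;
		s += strcspn(s, " \t");		/* find end of the word */
		if (*s == '\0')
			break;
		*s++ = '\0';
	}
	argv[argc] = NULL;
	return argc;
}
```

A caller would copy the result of `getenv("APROPOS")` into a writable buffer first, since the string is modified in place, and then hand the resulting vector to the same option parser used for the command line.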
|
1.11 |
| 10-Feb-2013 |
christos | add -r flag to elide tty formatting
|
1.10 |
| 10-Feb-2013 |
christos | remove trailing whitespace
|
1.9 |
| 14-Jan-2013 |
christos | - move the terminal handling in apropos-utils.c since html and pager are also handled there.
- underline the name, section, and description so that it is prettier.
- change the terminal highlighting to bold to match less
|
1.8 |
| 14-Jan-2013 |
christos | Since mdocml decided to name headers that conflict with system ones (term.h) move the header inclusion one up.
|
1.7 |
| 06-Oct-2012 |
wiz | Make mandb path configurable. makemandb (and related tools) use the path from the _mandb variable from man.conf now.
Set _mandb in man.conf to same value as was used before.
From Abhinav Upadhyay <er.abhinav.upadhyay@gmail.com>.
|
1.6 |
| 10-May-2012 |
joerg | branches: 1.6.2; KNF
|
1.5 |
| 07-May-2012 |
wiz | PR 46419 by Abhinav Upadhyay using his updated patch: Clean up after removing man page aliases.
|
1.4 |
| 15-Apr-2012 |
wiz | branches: 1.4.2; Handle pages with slashes in their names better. From Abhinav Upadhyay in private mail.
|
1.3 |
| 07-Apr-2012 |
apb | Add the result from sqlite3_errmsg() to some error messages. Now we can get "apropos: Unable to query schema version: database is locked" instead of just "apropos: Unable to query schema version".
|
1.2 |
| 07-Feb-2012 |
joerg | branches: 1.2.2; Fix C&P error with $NetBSD$
|
1.1 |
| 07-Feb-2012 |
joerg | Import the new apropos/whatis.
This code has been developed by Abhinav Upadhyay as part of Google's Summer of Code 2011. It uses libmandoc to parse man pages and builds a Full Text Index in a SQLite database. The combination of indexing the full manual page, filtering out stop words and ranking individual matches based on the section gives a much improved user experience.
The old makewhatis and friends are kept under MKMAKEMANDB=no for now.
|
1.2.2.2 |
| 09-May-2012 |
riz | Pull up following revision(s) (requested by wiz in ticket #229):
usr.sbin/makemandb/makemandb.c: revision 1.9
usr.sbin/makemandb/DBSCHEMA: revision 1.2
usr.sbin/makemandb/apropos-utils.c: revision 1.5
usr.sbin/makemandb/apropos-utils.h: revision 1.3
PR 46419 by Abhinav Upadhyay using his updated patch: Clean up after removing man page aliases.
|
1.2.2.1 |
| 19-Apr-2012 |
riz | Pull up following revision(s) (requested by wiz in ticket #186):
usr.sbin/makemandb/apropos.c: revision 1.6
usr.sbin/makemandb/apropos-utils.c: revision 1.3
usr.sbin/makemandb/apropos-utils.c: revision 1.4
Add the result from sqlite3_errmsg() to some error messages. Now we can get "apropos: Unable to query schema version: database is locked" instead of just "apropos: Unable to query schema version".
Handle pages with slashes in their names better. From Abhinav Upadhyay in private mail.
|
1.4.2.6 |
| 22-May-2014 |
yamt | sync with head.
for a reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
1.4.2.5 |
| 23-Jan-2013 |
yamt | sync with head
|
1.4.2.4 |
| 30-Oct-2012 |
yamt | sync with head
|
1.4.2.3 |
| 23-May-2012 |
yamt | sync with head.
|
1.4.2.2 |
| 17-Apr-2012 |
yamt | sync with head
|
1.4.2.1 |
| 15-Apr-2012 |
yamt | file apropos-utils.c was added on branch yamt-pagecache on 2012-04-17 00:09:49 +0000
|
1.6.2.4 |
| 20-Aug-2014 |
tls | Rebase to HEAD as of a few days ago.
|
1.6.2.3 |
| 23-Jun-2013 |
tls | resync from head
|
1.6.2.2 |
| 25-Feb-2013 |
tls | resync with head
|
1.6.2.1 |
| 20-Nov-2012 |
tls | Resync to 2012-11-19 00:00:00 UTC
|
1.15.4.1 |
| 10-Aug-2014 |
tls | Rebase.
|
1.16.2.1 |
| 15-Apr-2016 |
snj | Pull up following revision(s) (requested by christos in ticket #1142):
usr.sbin/makemandb/apropos-utils.c: revisions 1.18, 1.19
CID 1341551: Don't bother formatting if ti == NULL
--
PR/50344: Stephen Fisher: apropos shows formatting on console with vt100 term type. Can't print terminfo sequences directly; need to process them with ti_puts() to handle padding. This removes the padding delays, and strictly could break on slow terminal hardware, but the way the code is structured makes it impossible to fix properly (since the formatting strings are passed in the query).
|
1.28.2.3 |
| 26-Apr-2017 |
pgoyette | Sync with HEAD
|
1.28.2.2 |
| 20-Mar-2017 |
pgoyette | Sync with HEAD
|
1.28.2.1 |
| 04-Nov-2016 |
pgoyette | Sync with HEAD
|
1.31.2.1 |
| 02-May-2017 |
pgoyette | Sync with HEAD - tag prg-localcount2-base1
|
1.40.4.2 |
| 13-Apr-2020 |
martin | Mostly merge changes from HEAD up to 20200411
|
1.40.4.1 |
| 10-Jun-2019 |
christos | Sync with HEAD
|
1.45.2.1 |
| 03-Jun-2022 |
martin | Pull up following revision(s) (requested by gutteridge in ticket #1461):
usr.sbin/makemandb/apropos.1: revision 1.19
usr.sbin/makemandb/apropos.c: revision 1.25
usr.sbin/makemandb/apropos.c: revision 1.26
usr.sbin/makemandb/apropos.1: revision 1.20
usr.sbin/makemandb/apropos.1: revision 1.21
usr.sbin/makemandb/apropos.1: revision 1.22
usr.sbin/makemandb/apropos.1: revision 1.23
usr.sbin/makemandb/apropos-utils.c: revision 1.46
usr.sbin/makemandb/apropos-utils.c: revision 1.47
usr.sbin/makemandb/apropos-utils.c: revision 1.49
PR/54343: Prevent NULL pointers in callback strings; use "*?*" for now to identify them.
PR bin/54343: We want the callback_args.machine to be NULL if it is not present in the DB.
The previous commit fixed the problem of allowing apropos to not crash and produce output even if the database is missing values for certain mandatory fields, such as name, section etc. Normally we don't expect those values to be missing in the database but in case of parsing errors it can happen.
However, the machine architecture is an optional field since not all man pages are hardware specific so that should be allowed to be set to NULL if not present in the database.
apropos.c: fix pager functionality
Issue reported by Rocky Hotas on NetBSD-Users, patch input from RVP on same, adjustments by me.
apropos.1: document the PAGER environment variable
apropos(1): use proper -width
apropos(1): use proper -width for the list of options too
apropos(1): Tweak the description of -1, ... -9, and -s
-s is not for compatibility only, because section names can be anything. E.g. we have 3lua and 9lua in base. We have rudiments of 3f (for FORTRAN libs). Some packages in pkgsrc also use suffixed 1 and 3 sections.
apropos(1): Use the official spelling for "SQLite". While here, use .Bx to refer to 3BSD.
apropos(1): improve error handling in edge cases
Patch from RVP on NetBSD-Users, with an additional comment tweak by me.
Summary from RVP:
1. Ignore SIGPIPE so that we're not killed in the middle of some DB operation by a botched $PAGER:
$ env PAGER=/non-existent apropos -p ...
2. Return proper exit status in case of write errors:
$ apropos ... >/dev/full || echo fail
|