Search feature #1

Merged
merged 1 commit into from
Sep 18, 2024
Conversation


@tuanngocnguyen tuanngocnguyen commented Sep 17, 2024

CLI script

WIP: Still working on regex-match

Search text throughout the whole database.

Options:
--search=STRING                  String to search for.
--regex-match=STRING             Use regular expression to match the search string.
--tables=tablename:columnname    Tables and columns to search. Separate multiple tables/columns with a comma.
                                 If not specified, search all tables and columns.
                                 If only a table is specified, search all columns in that table.
                                 Example: 
                                    --tables=user:username,user:email
                                    --tables=user,assign_submission:submission
                                    --tables=user,assign_submission
--summary                        Summary mode, only shows column/table where the text is found.
                                 If not specified, run in detail mode, which shows the full text where the search string is found.
-h, --help                       Print out this help.

Example:
$ sudo -u www-data /usr/bin/php admin/tool/advancedreplace/cli/find.php --search=thelostsoul --summary
$ sudo -u www-data /usr/bin/php admin/tool/advancedreplace/cli/find.php --regex-match=thelostsoul --summary

@tuanngocnguyen tuanngocnguyen force-pushed the search_feature branch 4 times, most recently from 5fb3bd7 to be98c28 Compare September 18, 2024 02:12
@tuanngocnguyen tuanngocnguyen changed the title DRAFT: Search feature Search feature Sep 18, 2024
README.md Outdated
- Find all occurrences of "http://example.com/" followed by any number of digits on tables:

`php admin/tool/advancedreplace/cli/find.php --regex-match=http://example.com/\\d+`
Contributor


If you quote the params with "" in bash, would the \\ become just \ and read more easily?

Contributor Author


I will fix this

cli/find.php Outdated
if (!empty($matches[0])) {
    // Show the result foreach match.
    foreach ($matches[0] as $match) {
        echo "$table, $column, $id, \"$match\"\n";
Contributor


Can you please use either the PHP or the Moodle APIs for writing CSV, so that content containing commas and quotes is escaped properly?
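
For illustration, a minimal sketch of the PHP route using the built-in `fputcsv()`. The helper name `format_match_row` is hypothetical and not part of the plugin, and the empty `$escape` argument assumes PHP 7.4 or later:

```php
<?php
// Hypothetical helper (not from the plugin): format one search hit as a CSV line.
function format_match_row(string $table, string $column, int $id, string $match): string {
    // Write to an in-memory stream so the formatted line can be returned as a string.
    $stream = fopen('php://memory', 'w+');
    // An empty $escape turns off PHP's backslash escaping, so embedded
    // quotes are always doubled and fields with commas are enclosed.
    fputcsv($stream, [$table, $column, $id, $match], ',', '"', '');
    rewind($stream);
    $line = stream_get_contents($stream);
    fclose($stream);
    return $line;
}

// A match containing both a comma and quotes stays in a single CSV field.
echo format_match_row('user', 'email', 42, 'He said "hi", then left');
```

In the CLI script itself the same call could write straight to `STDOUT` instead of an in-memory stream.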

* @param int $limit The maximum number of results to return.
* @return array
*/
public static function search(string $search, bool $regex = false, string $tables = '', int $limit = 0): array {
Contributor


When we scale up, I'm pretty sure this will start breaking, as everything is held in memory.

So it is probably fine to merge right now, but we'll need to refactor this later. We will probably want to do things in two phases: the first phase returns a list of tables to search, and the second phase searches table by table and returns a record set we can stream through.
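
A rough sketch of that two-phase shape, using plain PHP generators. Everything here is an assumption for illustration: the `$fakedb` array stands in for the database, and `plan_search` and `search_table` are hypothetical names. In Moodle the second phase would more likely iterate a recordset (for example from `$DB->get_recordset_sql()`) rather than an array:

```php
<?php
// Hypothetical stand-in for the database: table name => rows.
$fakedb = [
    'user' => [
        ['id' => 1, 'username' => 'thelostsoul', 'email' => 'a@example.com'],
        ['id' => 2, 'username' => 'other', 'email' => 'thelostsoul@example.com'],
    ],
    'course' => [
        ['id' => 7, 'fullname' => 'Intro'],
    ],
];

// Phase 1 (hypothetical): work out which tables and columns will be searched.
function plan_search(array $db, array $tables = []): array {
    $plan = [];
    foreach ($db as $table => $rows) {
        if ($tables && !in_array($table, $tables, true)) {
            continue;
        }
        // Search every column except the id.
        $columns = array_diff(array_keys($rows[0] ?? []), ['id']);
        $plan[$table] = array_values($columns);
    }
    return $plan;
}

// Phase 2 (hypothetical): yield one match at a time instead of building a full array.
function search_table(iterable $rows, string $table, array $columns, string $search): Generator {
    foreach ($rows as $row) {
        foreach ($columns as $column) {
            if (strpos((string) $row[$column], $search) !== false) {
                yield [$table, $column, $row['id'], $row[$column]];
            }
        }
    }
}

// The caller streams through the hits table by table.
foreach (plan_search($fakedb) as $table => $columns) {
    foreach (search_table($fakedb[$table], $table, $columns, 'thelostsoul') as $hit) {
        echo implode(' | ', $hit), "\n";
    }
}
```

Because `search_table` accepts any `iterable`, only one row needs to be in memory at a time when it is fed from a streaming source.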

Contributor


Also, thinking ahead: we talked about this script just outputting to stdout, but it might be better if it writes its results to a file, so that we can show a set of progress bars on stdout. Knowing how long things will take, and how far along we are with ETAs, is going to be fairly important. Again, this is not critical now, but I'll make issues for it.
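
One possible shape for that, as a sketch only: results go to a file while a progress line with a naive ETA goes to stdout. The helper names, the table list, and the assumption that every table takes roughly the same time are all hypothetical:

```php
<?php
// Hypothetical helper: remaining seconds, assuming each table takes about the same time.
function estimate_eta(int $done, int $total, float $elapsed): ?float {
    if ($done <= 0) {
        return null; // No basis for an estimate yet.
    }
    return $elapsed / $done * ($total - $done);
}

// Hypothetical helper: one human-readable progress line.
function progress_line(int $done, int $total, float $elapsed): string {
    $eta = estimate_eta($done, $total, $elapsed);
    $percent = $total > 0 ? intdiv($done * 100, $total) : 100;
    $etatext = $eta === null ? 'unknown' : sprintf('%ds', (int) ceil($eta));
    return sprintf('%d/%d tables (%d%%), ETA %s', $done, $total, $percent, $etatext);
}

// Results are written to a file; only progress is printed.
$tables = ['user', 'course', 'assign_submission']; // Placeholder list.
$outfile = tempnam(sys_get_temp_dir(), 'find');
$out = fopen($outfile, 'w');
$start = microtime(true);
foreach ($tables as $i => $table) {
    fwrite($out, "searched $table\n"); // Stand-in for the real per-table search output.
    echo progress_line($i + 1, count($tables), microtime(true) - $start), "\n";
}
fclose($out);
```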

@tuanngocnguyen tuanngocnguyen force-pushed the search_feature branch 6 times, most recently from 80e605d to 2b2ef2d Compare September 18, 2024 03:01
@brendanheywood brendanheywood merged commit fb908d4 into main Sep 18, 2024
7 of 24 checks passed
@brendanheywood brendanheywood deleted the search_feature branch September 18, 2024 03:18