correctly handle options like '-f a.txt' #251

Closed
wants to merge 1 commit into from
167 changes: 166 additions & 1 deletion src/tbox/utils/option.c
@@ -249,7 +249,7 @@ tb_bool_t tb_option_find(tb_option_ref_t option, tb_char_t const* name)
// find it
return tb_oc_dictionary_value(impl->list, name)? tb_true : tb_false;
}
tb_bool_t tb_option_done(tb_option_ref_t option, tb_size_t argc, tb_char_t** argv)
tb_bool_t tb_norm_option_done(tb_option_ref_t option, tb_size_t argc, tb_char_t** argv)
{
// check
tb_option_impl_t* impl = (tb_option_impl_t*)option;
@@ -565,6 +565,171 @@ tb_bool_t tb_option_done(tb_option_ref_t option, tb_size_t argc, tb_char_t** argv)
// ok
return tb_true;//tb_option_check(impl);
}
static void tb_print_argv(tb_size_t argc, tb_char_t* argv[])
{
tb_trace_i("argc: %d", argc);
tb_size_t i = 0;
for (i = 0; i < argc; i++)
{
tb_trace_i("argv[%d]: %s", i, argv[i]);
}
}
Member:
Delete print code like this.

tb_bool_t tb_option_done(tb_option_ref_t option, tb_size_t argc, tb_char_t** argv)
{
tb_bool_t ok = tb_false;
// check
tb_option_impl_t* impl = (tb_option_impl_t*)option;
tb_assert_and_check_return_val(impl && impl->list && impl->opts, tb_false);

// Prints the command-line arguments before normalization
// tb_trace_i("OLD:");
// tb_print_argv(argc, argv);
Member:
Delete unrelated comments.


// Normalize the parameters, i.e., transform the form '-f a.txt' into the form '-f=a.txt'
// Make a two-dimensional character array to hold command-line arguments
// The number of first dimensions will not exceed 'argc'
Member:
Use /* */ for multi-line comments.

// The number of the second dimension will not exceed twice the number of the longest 'argv[i]' +1
// (why +1, because of the equal sign)
// Let's find out how long the longest argv[i] is
tb_size_t i = 0;
tb_size_t longest = 0;
Member:
Don't leave so many blank lines in between.

for (i = 0; i < argc; i++)
{
tb_size_t argvi_len = tb_strlen(argv[i]);
if (argvi_len > longest)
{
longest = argvi_len;
}
}

// Allocate space
tb_char_t* data = tb_malloc(argc * (2*longest+1+1)); // including the trailing '\0'
// Allocate space for normalized argv array. Array elements are of tb_char_t*
tb_char_t** new_argv = tb_malloc(argc * sizeof (tb_char_t*));
Member:
Why does parsing arguments need two allocations? Try to keep the parsing allocation-free.

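Contributor Author @duyanning (Dec 2, 2023):

I can revise everything else according to your review comments,
but I don't think I can manage this part without allocating memory.

To avoid, as far as possible, subtle behavioral differences from your existing implementation that could break tbox users' existing code,
I chose the following approach:
first normalize command-line arguments of the form
-f a.txt -h file0 -f=b.txt file1 --config c.txt file2 --config=d.txt file3 -f -h file4
(where -f and --config take a value and -h does not)
into the form
-f=a.txt -h file0 -f=b.txt file1 --config=c.txt file2 --config=d.txt file3 -f -h file4
(that is, add an equal sign wherever one belongs),
and then call your original interface directly to parse them.

Both allocations are there to avoid affecting users' existing code.
One allocates space for the normalized argv[] array,
the other allocates space for the strings that the elements of the normalized argv[] array point to.
The normalized argv[] array may have fewer elements than the original argv[] array,
and the strings its elements point to may differ from those of the original argv[] array.

For example,
the original argv[] array looks like this: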

argc: 14
argv[0]: -f
argv[1]: a.txt
argv[2]: -h
argv[3]: file0
argv[4]: -f=b.txt
argv[5]: file1
argv[6]: --config
argv[7]: c.txt
argv[8]: file2
argv[9]: --config=d.txt
argv[10]: file3
argv[11]: -f
argv[12]: -h
argv[13]: file4

After normalization it looks like this:

argc: 12
argv[0]: -f=a.txt
argv[1]: -h
argv[2]: file0
argv[3]: -f=b.txt
argv[4]: file1
argv[5]: --config=c.txt
argv[6]: file2
argv[7]: --config=d.txt
argv[8]: file3
argv[9]: -f
argv[10]: -h
argv[11]: file4
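
The normalization described in this comment can be sketched outside tbox as follows. This is a minimal illustration, not the code from the patch: `takes_value` and `normalize_args` are hypothetical names, the rule that only `-f` and `--config` take a value is hard-coded from the example above (the patch looks it up in the option table), and the caller supplies the output buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical rule for this sketch only: "-f" and "--config" take a value. */
static int takes_value(const char* arg)
{
    return strcmp(arg, "-f") == 0 || strcmp(arg, "--config") == 0;
}

/* Joins "KEY VAL" pairs into "KEY=VAL". The strings are written into buf and
 * the pointers into out, which must hold at least argc entries.
 * Returns the new argument count, or -1 if buf is too small. */
static int normalize_args(int argc, char** argv, char** out, char* buf, size_t buf_size)
{
    int    n = 0;
    size_t used = 0;
    assert(argv && out && buf);
    for (int i = 0; i < argc; i++)
    {
        /* merge only when the next argument exists and is not another key */
        int merge = takes_value(argv[i]) && i + 1 < argc && argv[i + 1][0] != '-';
        int len = merge
            ? snprintf(buf + used, buf_size - used, "%s=%s", argv[i], argv[i + 1])
            : snprintf(buf + used, buf_size - used, "%s", argv[i]);
        if (len < 0 || (size_t)len >= buf_size - used) return -1;
        out[n++] = buf + used;
        used += (size_t)len + 1; /* keep the trailing '\0' */
        if (merge) i++;          /* the value has been consumed */
    }
    return n;
}
```

Called on the 14 arguments listed above with a large enough buffer, this sketch returns 12 entries, with `-f a.txt` and `--config c.txt` each merged into a single `KEY=VAL` string and the trailing `-f -h` left untouched.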

Contributor Author @duyanning (Dec 2, 2023):

Quoting the review: "Why does parsing arguments need two allocations? Try to keep the parsing allocation-free."

Never mind, I give up.
I just realized that for a key of type TB_OPTION_TYPE_BOOL,
the value after it is optional and defaults to true when omitted.
Parsing in place without allocating memory
is more complex than I expected, and I don't want to spend more time on it.

Thank you, ruki, for your long and hard work.

Member:
Then go ahead and close it. Alternatively, you can use an existing parsing library; there are plenty of them in xmake-repo.

ruki-2:xmake-repo ruki$ xrepo search opt
The package names:
    opt:
      -> popt-1.19: C library for parsing command line parameters (in xmake-repo)
      -> docopt-v0.6.3: Pythonic command line arguments parser (C++11 port) (in xmake-repo)
      -> cxxopts-v3.1.1: Lightweight C++ command line option parser (in xmake-repo)
      -> cgetopt-1.0: A GNU getopt() implementation written in pure C. (in xmake-repo)
      -> cargs-v1.0.3: A lightweight cross-platform getopt alternative that works on Linux, Windows and macOS. Command line argument parser library for C/C++. Can be used to parse argv and argc parameters. (in xmake-repo)
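
For comparison, the allocation-free parsing requested in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: `opt_view_t`, `key_takes_value` and `next_option` are hypothetical names that do not exist in tbox, the set of value-taking keys is hard-coded, and the optional value of TB_OPTION_TYPE_BOOL keys mentioned above is not handled. The idea is to return pointers into the original argv strings and, for `-f a.txt`, to borrow the next argument as the value instead of copying it.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* One parsed option: pointers into the original argv strings, nothing is copied. */
typedef struct
{
    const char* key;     /* start of the key, after the leading dashes */
    size_t      key_len; /* key length, excluding any "=VAL" suffix */
    const char* val;     /* value, or NULL if none was given */
} opt_view_t;

/* Hypothetical rule for this sketch only: keys "f" and "config" take a value. */
static int key_takes_value(const char* key, size_t len)
{
    return (len == 1 && key[0] == 'f') || (len == 6 && memcmp(key, "config", 6) == 0);
}

/* Parses the argument at argv[*index] and advances *index past everything consumed.
 * Returns 1 for an option, 0 for a plain argument (out->val then points at it). */
static int next_option(int argc, char** argv, int* index, opt_view_t* out)
{
    assert(argv && index && out && *index < argc);
    const char* arg = argv[(*index)++];
    const char* p = arg;
    out->key = NULL;
    out->key_len = 0;
    out->val = arg;
    if (p[0] != '-') return 0;
    p += (p[1] == '-') ? 2 : 1;
    if (!isalpha((unsigned char)*p)) return 0; /* "-", "--", "-5": plain argument */

    const char* eq = strchr(p, '=');
    out->key = p;
    out->key_len = eq ? (size_t)(eq - p) : strlen(p);
    out->val = eq ? eq + 1 : NULL;

    /* "-f a.txt": borrow the next argument as the value instead of copying it */
    if (!out->val && key_takes_value(out->key, out->key_len)
        && *index < argc && argv[*index][0] != '-')
        out->val = argv[(*index)++];
    return 1;
}
```

A caller would loop with `while (i < argc) next_option(argc, argv, &i, &o);` and look up each key by `key` and `key_len`, since keys taken from `KEY=VAL` strings are not NUL-terminated.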

for (i = 0; i < argc; i++)
{
new_argv[i] = data + i*(2*longest+1+1);
}
tb_size_t new_argc = 0;

// Examine each element of the original 'argv' array one by one
for (i = 0; i < argc; i++)
{
// the argument
tb_char_t* p = argv[i];
tb_char_t* e = p + tb_strlen(p);
tb_assert_and_check_return_val(p && p < e, tb_false);

// Determine the kind of argv[i].
// 0 - long KEY;
// 1 - short KEY
// 2 - not KEY (It may be an independent VAL, or it may be a VAL attached to the previous KEY
// It depends on whether the previous KEY has VAL or not)
tb_size_t kind;
if (p + 2 < e && p[0] == '-' && p[1] == '-' && tb_isalpha(p[2]))
kind = 0;
else if (p + 1 < e && p[0] == '-' && tb_isalpha(p[1]))
kind = 1;
else
kind = 2;

// If it is a KEY, determine from the option definition
// whether it is a simple KEY or a KEY with VAL
if (kind == 0 || kind == 1)
{
tb_char_t key[512] = {0};
tb_char_t* val;
tb_option_item_t const* find;
if (kind == 0)
{
// the long key
{
tb_char_t* k = key;
tb_char_t* e = key + 511;
for (p += 2; *p && *p != '=' && k < e; p++, k++) *k = *p;
}
// the val
val = (*p == '=')? (p + 1) : tb_null;
find = tb_option_item_find(impl->opts, key, '\0');
}
else
{
// the short key
{
tb_char_t* k = key;
tb_char_t* e = key + 511;
for (p += 1; *p && *p != '=' && k < e; p++, k++) *k = *p;
}
// the val
val = (*p == '=')? (p + 1) : tb_null;
find = tb_option_item_find(impl->opts, tb_null, key[0]);
}

// If it is a KEY with VAL
// (an unknown KEY has no definition, so check 'find' before dereferencing it)
if (find && find->mode == TB_OPTION_MODE_KEY_VAL)
{
// Then determine whether VAL is in argv[i] or argv[i+1].
// If it's in argv[i], copy it directly to the new_argv array
// If it's in argv[i+1], copy argv[i]=argv[i+1] to the new_argv array
if (val)
{
tb_strcpy(new_argv[new_argc], argv[i]);
new_argc++;
}
else
{
// If there is no argv[i+1], or argv[i+1] also starts with '-'
// (that is, it is another KEY), report an error
if (i+1 >= argc || argv[i+1][0]=='-')
{
tb_trace_e("%s: no option value '--%s='", impl->name, key);
tb_strcpy(new_argv[new_argc], argv[i]);
new_argc++;
}
else
{
tb_strcpy(new_argv[new_argc], argv[i]);
tb_strcat(new_argv[new_argc], "=");
tb_strcat(new_argv[new_argc], argv[i+1]);
i++;
new_argc++;
}
}
}
else
{
// If it is a simple KEY, copy it to the new_argv array
tb_strcpy(new_argv[new_argc], argv[i]);
new_argc++;
}

}
else
{
// If it's something else, copy directly to the new argv array.
// This one is necessarily an independent VAL,
// because the subsidiary VAL has already been skipped.
tb_strcpy(new_argv[new_argc], argv[i]);
new_argc++;
}

}

// Prints the normalized command-line arguments
// tb_trace_i("NEW:");
// tb_print_argv(new_argc, new_argv);
Member:
Delete this.


// This is the original tb_option_done, renamed to tb_norm_option_done
// (tbox style prefers 'norm' over 'normalized') because it can only
// handle normalized options (i.e., options like '-f=a.txt').
// Pass it the normalized arguments built above.
ok = tb_norm_option_done(option, new_argc, new_argv);
Member:
It can't be changed this way. Everything must be parsed in one pass inside tb_option_done; don't split it into two interfaces.


tb_free(new_argv);
tb_free(data);

return ok;
}
tb_void_t tb_option_dump(tb_option_ref_t option)
{
// check
12 changes: 11 additions & 1 deletion src/tbox/utils/option.h
@@ -110,7 +110,17 @@ tb_void_t tb_option_exit(tb_option_ref_t option);
*/
tb_bool_t tb_option_find(tb_option_ref_t option, tb_char_t const* name);

/*! done option
/*! done normalized option (normalized option is like '-f=a.txt')
*
* @param option the option
* @param argc the arguments count
* @param argv the arguments value
*
* @return tb_true or tb_false
*/
tb_bool_t tb_norm_option_done(tb_option_ref_t option, tb_size_t argc, tb_char_t** argv);
Member @waruqi (Dec 2, 2023):

All interfaces start with tb_option_. Besides, this is just argument parsing; don't add a new interface.

Also, renaming it here would break existing users.


/*! done option (also supports '-f a.txt')
*
* @param option the option
* @param argc the arguments count