-
Notifications
You must be signed in to change notification settings - Fork 764
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Change]修改目前 Paddle docs 的代码检查方式 #5962
Conversation
感谢你贡献飞桨文档,文档预览构建中,Docs-New 跑完后即可预览,预览链接:http://preview-pr-5962.paddle-docs-preview.paddlepaddle.org.cn/documentation/docs/zh/api/index_cn.html |
@@ -83,6 +84,23 @@ def find_all(src_str, substr): | |||
return indices | |||
|
|||
|
|||
def strip_ps1_from_codeblock(codeblock): | |||
"""strip PS1(>>> ) from codeblock""" | |||
mo = re.search(r"\n>>>\s?", "\n" + codeblock) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我很好奇,mo
是什么缩写?match_obj
(个人习惯 re 匹配的结果使用 match_obj
)?原来的代码里很多不易读的缩写就不要沿用了,不易于维护
|
||
codeblock_clean = [] | ||
for line in codeblock.splitlines(): | ||
mo = re.match(r"^>>>\s?", line.lstrip()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...
也是合法的前缀吧,这里没有考虑这种情况嘛?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
后面写示例的时候,建议规定都用 >>>
,这也是 xdoctest 的 Enhancements 之一 new relaxed (and backwards-compatible) syntax
所以这里只做这个处理~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我是觉得,既然引入 xdoctest,那么就按照新方式来做,不然 >>>
与 ...
混用的话又回去了,后面如果不用 xdoctest了,改起来又是一个大工程...
至于图片里面的这个 case
,不支持~ 不是我这边 strip_ps1_from_codeblock
不支持,是目前代码检查的 extract_code_blocks_from_docstr
不支持 multiline-string
。
def extract_code_blocks_from_docstr(docstr, google_style=True):
"""
extract code-blocks from the given docstring.
DON'T include the multiline-string definition in code-blocks.
The *Examples* section must be the last.
...
后面我改一下兼容 ...
吧 ~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
至于图片里面的这个 case,不支持~
支持与不支持,不仅要在文档里说明,还应该在 Paddle 那边的 CI 有所体现,不然开发者的代码段明明是合法的,xdoctest 都过了,Paddle 那边 CI 也过了,但是之后更新 docs 的人却发现这边 CI 挂了,这是不能接受的
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Paddle 那边也不支持 ...
Paddle 那边的 sampcd_processor.py
:
def extract_code_blocks_from_docstr(docstr):
"""
extract code-blocks from the given docstring.
DON'T include the multiline-string definition in code-blocks.
The *Examples* section must be the last.
Args:
Paddle 那边的代码检查逻辑,与 docs 下面的 gen_doc
基本是一样的,开发的时候应该是代码直接 copy 过来的~
我在 rfc 里面有说~
docs/api/test/test_gen_doc.py
Outdated
if codeblock.get("in_examples") | ||
][0]["codes"] | ||
|
||
assert codeblock == target |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
之后相关例子直接用 Markdown 写在描述吧,这种形式有点不太易读
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是 docstrings 定义的这些例子吗?写在哪个地方的描述里?
docstrings 这些东西写在外面,是因为代码段我想用 """
把代码包起来,看起来相对直观一点~
如果写函数描述里面的话,那我把他们拉直了吧~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是 docstrings 定义的这些例子吗?写在哪个地方的描述里?
PR 的描述里就行
docs/api/gen_doc.py
Outdated
@@ -1025,6 +1025,23 @@ def filter_out_object_of_api_info_dict(): | |||
del api_info_dict[id_api]['object'] | |||
|
|||
|
|||
def strip_ps1_from_codeblock(codeblock): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里是不能复用上面的嘛?一定要 copy 嘛?
如果一定要 copy 的话,需要说明另一个在哪个位置,确保两者修改同步
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... 这个地方确实不知道怎么处理好 ...
ci_scripts 和 docs/api 都有代码检查的接口,可实现的还不一样,感觉这俩就是两次单独写出来的,谁也不挨着谁,复用的话感觉更别扭,所以这里就直接 copy 代码过来了 ...
你看咋整?🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这两个之间是什么关系呢?为什么两个地方都要改呢?
如果真的不能复用的话加上注释说明两者需要同步即可
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ci_scripts
与docs/api
既然是各自维护的(目前的状况),那么各自修改对应的方法应该是必须的。
那需要保证当前代码写得足够鲁棒才行,只要有问题,将来维护起来是很麻烦的(虽然现在的代码写的已经维护不起来了)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤨 我也不知道为啥当初在 ci_scripts 和 docs/api 都有代码检查,还逻辑不一样... docs 这个里面的 python 脚本似乎就没有打算作为 module 来设计 ...
算了,我用 sys.path.append(xxx)
来改一下吧 ~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
算了,我用 sys.path.append(xxx) 来改一下吧 ~
还是加注释吧,sys.path.append 也很奇怪
Update 20230625
最好的办法我觉得是,合并这两部分的逻辑,要不然怎么改都是打补丁 ... 😂 另外,test 如果最后其他问题都没有的话,我还是把它删掉再合并吧~ |
Update 20230626
注意,此方法存在以下问题: rfc 中说过,由于 xdoctest 可以使用 >>> for i in range(3):
>>> if i == 0:
>>> print('ok')
>>> print(i)
...
0
1
2
... 是可以检查通过的,但是,如果一旦混用 >>> for i in range(3):
... if i == 0:
... print('ok')
... print(i)
...
0
1
2
... 则 xdoctest 无法正常检查,因为无法判断 特此说明~ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
虽然整体逻辑略显复杂,但代码风格还是不错的,我没有仔细考虑这个问题,不确定自己能否写出更优的代码
因此这个 PR 先 merge 吧,单测可以删除一下了~
请评审!:) |
PR types
New features
PR changes
APIs
Description
中国软件开源创新大赛:飞桨框架任务挑战赛
赛题五:将 xdoctest 引入到飞桨框架工作流中
RFC [Add]将 xdoctest 引入到飞桨框架工作流中v1
ISSUE 赛题五:将 xdoctest 引入到飞桨框架工作流中 Tracking Issue
第一阶段
任务:修改目前 Paddle docs 的代码检查方式
这个任务的目的是,兼容后续带有
>>>
的 codeblock,基本逻辑是:>>>
,则认为是新样式的代码段,为了兼容其代码检查,将>>>
开头的行认为是代码,将其剥离,而其他的行则认为是输出或者描述,添加#
。如此这般,则转换后的代码写入文件后,仍可以运行并检查。涉及文件:
ci_scripts/chinese_samplecode_processor.py
增加strip_ps1_from_codeblock
方法。docs/api/gen_doc.py
增加strip_ps1_from_codeblock
方法。@SigureMo @luotao1 @jzhang533 @sunzhongkai588
请评审,谢谢!:)