-
Notifications
You must be signed in to change notification settings - Fork 3.5k
[C++] Check for case insensitive keywords #8420
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
397926d to
0aabe43
Compare
|
Sorry for the CLA notice. I mistakenly used the wrong git account. Rebased and now should be fixed. |
|
This pull request is stale because it has been open 6 months with no activity. Please comment or label |
|
It is still relevant. |
|
This pull request is stale because it has been open 6 months with no activity. Please comment or label |
|
Still relevant |
|
@razvanalex - would you please rebase this and take care of the merge conflicts? I've added it to my list to review and try to get merged! |
bfef3eb to
1d3835d
Compare
|
@jtdavis777, thanks! It should be ready for review now. I also rebased my other branches that add the remaining PHP functionality (finishSizePrefixed*, object API, and optional scalars), which depend on this PR. I will open PRs for those as well soon. |
Most languages are not affected by this change. In PHP, some names such as Bool cannot be used because it is a reserved keyword for to the bool data type. The field `keywords_casing` in the configs enables checking all characters in lowercase against every keyword. This should be safe as flatbuffers does not allow non-ASCII characters in its grammar.
1d3835d to
034e10c
Compare
|
@razvanalex before I merge this, a quick check -- for PHP (and other languages which might use this feature), is this a possible overcorrection? i.e. are there only a few keywords in PHP that have this issue, but we are enabling the feature for all of them? |
|
At least for PHP, the specification states that keywords are not case-sensitive: https://github.com/php/php-langspec/blob/master/spec/09-lexical-structure.md#keywords I don't know if there is a language in which some keywords should be used with sensitive casing (e.g., if, for, while) and others with insensitive casing (e.g., bool/Bool, int/Int, float/Float). If someone knows such language, please let me know :) Worst, it will be a bit annoying to have "_" at the end of a keyword-like class name. However, this change will should not affect the rest of (actively maintained) languages. Another solution would be to keep both lists separate (one for case sensitive keywords, and one for case insensitive keywords), but I feel it would increase complexity a bit, to manage both lists. |
jtdavis777
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall LGTM - will approve once comment is resolved!
|
@razvanalex gotcha! this is fine -- I just wanted to make sure we weren't being unnecessarily restrictive. This looks great. |
| if (keywords_.find(name) == keywords_.end()) { | ||
| std::string cased_name(name); | ||
|
|
||
| if (config_.keywords_casing == Config::KeywordsCasing::CaseInsensitive) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you put into a method?
Also, prefer to use an existing function: https://github.com/google/flatbuffers/blob/master/src/util.cpp#L130
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I added NormalizeKeywordCase to handle this and used ConvertCase(name, Case::kAllLower) instead.
I have now noticed an edge case in Namer. If config_.keywords_casing is set to CaseInsensitive but the provided keywords set contains uppercase strings (e.g., NULL), the EscapeKeyword check fails.
I could update the Namer constructor (https://github.com/razvanalex/flatbuffers/blob/b4cdffa92796ab49c387e942ddbb5f056d328bf6/include/codegen/namer.h#L117) to convert to lowercase when KeywordsCasing::CaseInsensitive is set, but I feel this is counterintuitive to those who write new generators, since they may use upper/mixed case on purpose (e.g., for historic reasons). It is not an issue now, as I want to use the feature only for PHP. Should I address this case in the constructor? I am asking because the other solution I can think of is to use a private member to store the set processed, or, simpler, leave a comment in the code, and if someone needs it, then implement it.
Adds check against case insensitive keywords. Most languages are not affected by this change.
In PHP, some names such as
Boolcannot be used because it is a reserved keyword for thebooldata type. Adding the fieldkeywords_casingin the configs enables checking in lowercase of all characters against every keyword. This should be safe as flatbuffers does not allow non-ASCII characters in its grammar.This feature will be used in future PRs to improve PHP support in flatbuffers.