Support AbstractFilePath
#192
base: master
Suppresses redundant patterns warnings.
Wrt aeson, also see: Can you point me to the places where you use I'm on my phone and PR reviews are hard. Generally, we shouldn't need to ever assume an encoding. Using |
@mmhat I'll gladly review this and probably get it merged eventually, but I need some time to review this carefully. |
@mmhat I just tried building and testing this locally. Could you get |
- Removed 'OsPath.Internal.toFilePath'
- Assume Unicode encoding for Aeson type class instances
- Stack build works with --pedantic flag
I read this blog post a while ago, but I really forgot that JSON marshalling is covered there!
Ah, yes, you are right; I changed the JSON instances now such that they assume Unicode encoding for the underlying filepath.
Done. |
src/OsPath/Internal/Include.hs
Outdated
instance ToJSON (Path b t) where
  toJSON =
    either (error . displayException) toJSON
      . OsPath.decodeUtf
We can avoid the error, but it's up for debate:
- try decodeUtf (equals utf8 on unix and utf16le on windows)
- in case of failure, fall back to latin1 on unix and ucs-2 on windows, which both cannot fail (morally... the types of decodeWith lie in this case)
This is what I dislike about aeson. You can't really specify different encodings for the same type, unlike waargonaut.
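The fallback described in this comment could look roughly like the following. This is only a sketch, not the code from the PR: the names decodeLenient and roundTrip are hypothetical, and it assumes filepath >= 1.4.100, where System.OsPath exports decodeUtf and decodeWith and System.OsPath.Encoding exports ucs2le. decodeUtf is run in Maybe so that a failed strict decode becomes Nothing instead of an exception.

```haskell
import System.IO (latin1)
import System.OsPath (OsPath, decodeUtf, decodeWith, encodeUtf)
import System.OsPath.Encoding (ucs2le)

-- Hypothetical helper: try the strict platform decoding first
-- (UTF-8 on unix, UTF-16LE on windows), then fall back to
-- latin1 (unix) / UCS-2 (windows), which should not fail in practice.
decodeLenient :: OsPath -> Either String FilePath
decodeLenient p =
  case decodeUtf p of
    Just fp -> Right fp
    Nothing -> either (Left . show) Right (decodeWith latin1 ucs2le p)

-- Hypothetical helper: encode a FilePath and decode it again.
roundTrip :: FilePath -> Maybe FilePath
roundTrip fp = encodeUtf fp >>= either (const Nothing) Just . decodeLenient
```

A ToJSON instance built on such a helper would no longer need error for the decoding step, at the cost of silently producing a latin1/UCS-2 interpretation for paths that are not valid Unicode.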
Sounds fine to me.
@hasufell Does your comment also apply to the encoding functions (e.g. encodeWith ucs2le)? Also, I found System.OsString.Encoding.ucs2le, but no TextEncoding for latin1.
https://hackage.haskell.org/package/base-4.20.0.1/docs/System-IO.html#v:latin1
encodeWith would error on Chars outside of the range. There are very few, maybe no, encodings that are total under encodeWith, since you can represent surrogates as Chars.
https://hackage.haskell.org/package/base-4.20.0.1/docs/System-IO.html#v:latin1
Thanks; I missed that System.IO.TextEncoding and GHC.IO.Encoding.Types.TextEncoding are the same. The latter is linked in the Haddocks in System.OsString.Encoding, e.g. here.
Regarding the FromJSON/ToJSON instances: The more I think about it, the less I am convinced that they should be in this library in the first place, or at least that they should not be brought into scope by OsPath. There is clearly more than one sensible instance, and throughout the ecosystem that is dealt with by leaving the instance definition to the user of a library; for example, there are no Semigroup/Monoid instances for Int for that reason.
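For readers unfamiliar with the Int example: base leaves the choice to the caller via the newtype wrappers Sum and Product from Data.Monoid. A minimal illustration follows; the function names total and productOf are made up for this sketch.

```haskell
import Data.Monoid (Product (..), Sum (..))

-- Int has no Monoid instance of its own; the caller picks one
-- by wrapping the values in Sum or Product.
total :: [Int] -> Int
total = getSum . foldMap Sum

productOf :: [Int] -> Int
productOf = getProduct . foldMap Product
```

The same pattern would presumably apply to JSON instances for paths: a user-side newtype around the path type could select the encoding, rather than the library fixing one.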
It is of course convenient to have some pre-defined instances at hand that cover most of the use cases, but ideally those would be provided by another (public sub-) library. Personally, I don't make use of those instances more often than not, and pulling in aeson as a dependency is often unnecessary in my projects. So ultimately I'd like to get rid of the (mandatory) aeson dependency at some point.
As for this PR, I'm inclined to move the JSON-related instances for OsPath to their own module, say OsPath.Aeson, and add a flag aeson-instances to path that controls whether any JSON-related instances are built at all (defaults to true).
As for this PR, I'm inclined to move the JSON-related instances for OsPath to an own module, say OsPath.Aeson, and add a flag aeson-instances to path that controls whether any JSON-related instances are build at all (defaults to true).
cabal flags must not control exposed API though, because a package can not specify "I need this flag of one of my dependencies enabled". Cabal has no means to specify this, so the end user may specify an invalid configuration and get a compile error with missing modules.
Ah, yes, you are right indeed. And frankly I should've known better before making such a proposal, given how much time I spent trying to build GHCup before I figured out that I have to set a flag for streamly... I'm quite happy that this changed.
We've had a discussion about aeson before:
#136
If there is no good aeson instance (for example because it needs IO) then I'd rather not provide any, but honestly that'd be a bit weird :P.
      , (1, elements (map ord "./\\"))
      ]
    )
  shrinkValid _ = [] -- TODO: Not yet implemented
Didn't we have shrinking before?
This will be important if/when we run into bugs but it's only a maintenance burden so maybe not so important for users.
Yes, we did; but the one for FilePath is (obviously) the same as the one for the general-purpose String, which is provided by the genvalidity package. Since there are no pre-defined instances for OsPath, OsString or OsChar and we have to roll our own, I thought that we might reconsider how we shrink paths specifically.
I didn't find the time to do that, but I didn't want to blindly copy existing code either, hence the TODO...
But you are right, before we merge this I want to provide an implementation one way or the other.
Revisiting the genValid implementation, this looks wrong to me as well: IMHO the generator should not only produce Unicode paths, but random sequences of Word8/Word16 separated by /, \ and '.'.
I.e. we should not assume an encoding.
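An encoding-agnostic generator along these lines might look like the sketch below, for the POSIX side only. It is an illustration under assumptions, not the generator from this PR: genPosixBytes and samplePaths are hypothetical names, the 9:1 weighting is arbitrary, and it assumes QuickCheck plus a filepath/os-string version that exposes System.OsString.Posix.pack and the PosixChar constructor from System.OsString.Internal.Types.

```haskell
import Data.Word (Word8)
import System.OsString.Internal.Types (PosixChar (..), PosixString)
import qualified System.OsString.Posix as Posix
import Test.QuickCheck (Gen, arbitrary, elements, frequency, generate, listOf, vectorOf)

-- Hypothetical generator: arbitrary bytes with no encoding assumed,
-- biased towards the bytes for '.', '/' and '\\' so that separators
-- and extensions show up regularly.
genPosixBytes :: Gen PosixString
genPosixBytes = Posix.pack . map PosixChar <$> listOf genByte
  where
    genByte :: Gen Word8
    genByte =
      frequency
        [ (9, arbitrary)
        , (1, elements [0x2E, 0x2F, 0x5C]) -- '.', '/', '\\'
        ]

-- Hypothetical helper for trying the generator out in GHCi.
samplePaths :: Int -> IO [PosixString]
samplePaths n = generate (vectorOf n genPosixBytes)
```

A Windows counterpart would draw Word16 values instead. As the following replies point out, producing structurally meaningful Windows paths takes considerably more than this.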
Writing a meaningful generator for windows paths is incredibly hard. filepath has done this by expressing them in ABNF.
See here: https://github.com/haskell/filepath/blob/master/tests/filepath-equivalent-tests/Gen.hs
Uh, that looks horrifying! Thanks for the pointer though.
Uh, that looks horrifying! Thanks for the pointer though.
It is beautiful, because it's a pretty good ABNF. I'm planning to actually represent filepaths through that data structure at some point. Otherwise it's not possible to have correct splitting etc. Lots of functions in filepath are ad-hoc.
  shrinkValid _ = [] -- TODO: Not yet implemented

instance Validity PLATFORM_PATH where
  validate = trivialValidation -- TODO: Not yet implemented
This definitely looks incomplete. I'd be surprised if these were trivially valid.
TBH, I simply don't know (yet) what constitutes a valid PosixPath/WindowsPath... See also my comments in the previous thread.
@hasufell Do you have any idea what a sensible notion would be?
Depends on what we mean by "valid". Do you mean the isValid function from filepath? It's a bit fuzzy. The only thing you need to keep in mind additionally for the new OsPath API when dealing with the constructors yourself is that windows paths are expected to be [Word16]. So if you create a one-byte WindowsString manually, almost all functions will throw error.
I think what I meant was answered here: https://stackoverflow.com/questions/1976007/what-characters-are-forbidden-in-windows-and-linux-directory-names
This is not a problem for the instance in the test-suite of this library since we do not pass the generated paths to any OS API, but if we move the instance to a more general-purpose package like genvalidity-filepath, we have to implement those restrictions.
As some of those SO answers say: it's almost impossible to have an exhaustive list of invalid filepaths on windows. The filepath isValid is just best effort.
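To see what isValid does on each platform, the String-based versions from System.FilePath.Posix and System.FilePath.Windows can be compared side by side. This is a small sketch; validityReport is a hypothetical name, and the OsPath variants are assumed to behave analogously.

```haskell
import qualified System.FilePath.Posix as Posix
import qualified System.FilePath.Windows as Windows

-- Hypothetical helper: (valid on POSIX, valid on Windows) for one path.
validityReport :: FilePath -> (Bool, Bool)
validityReport p = (Posix.isValid p, Windows.isValid p)
```

Per the filepath documentation, a path such as "c:\\test\\prn.txt" is rejected by the Windows check because of the reserved device name, while the POSIX check only rejects empty paths and paths containing NUL.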
@mmhat Great job on this work. It must have been quite a fiddly thing to get right, but you got I had a look through the changes and don't see anything disagreeable. I wasn't confident so I will just ask you: Can you please double-check that the original |
Thank you!
I am fairly certain that none got removed but more were added, but I will double-check that and provide a summary of what has changed 👍 Regarding our earlier discussion about |
…om Path include file
Absolutely! EDIT: I'd be happy to guide you in that. |
- Also started with documentation.
- Expose 'OsPath.fromSomeBase'
- Require aeson >=1.0.0.0 to avoid tons of CPP
@mmhat If you're ever blocked on me, please ping me. |
This PR adds a new namespace OsPath to the library which contains a version of Path that is a newtype wrapper around the System.OsPath filepaths. As a consequence, all functions that took a FilePath as an argument or produced one, or that took/produced a String (e.g. extensions), now use the counterparts from System.OsPath and System.OsString respectively. toFilePath is an exception to this; it still gives you a System.FilePath.FilePath.
For the implementation I aimed to stay as close to the existing Path modules as possible; that's why there are, for example, also ports of the deprecated functions included.
As it stands, this PR consists of two parts: The first commits are cleanup/refactoring commits of the code that already exists. Those happened either because something was missing (hie.yaml), or something got in my way (.ghci.conf), or (the biggest part) as preliminary work before the replication of the various modules. For example, I refactored the test testsuite such that the copy could be adapted more easily to the new OsPath hierarchy.
There are some ugly parts where I am not entirely sure if I found the right solution, in particular the aeson instances and everything involving conversion from one encoding to another, where I had to use unsafeDupablePerformIO in some places.
Gently pinging @hasufell @mpilgrem (I noticed too late that you started working on this issue too) and @NorfairKing since I am interested in your feedback specifically.
Fixes #189